MODIFIED HIV ENV POLYPEPTIDES



Technical Field 

The invention relates generally to modified HIV envelope (Env) polypeptides which
are useful as immunizing agents or for generating an immune response in a subject, for
example a cellular immune response or a protective immune response. More particularly, the
invention relates to Env polypeptides such as gp120, gp140 or gp160, wherein at least one of
the native β-sheet configurations has been modified. The invention also pertains to methods
of using these polypeptides to elicit an immune response against a broad range of HIV
subtypes.

Background of the Invention 

The human immunodeficiency virus (HIV-1, also referred to as HTLV-III, LAV or
HTLV-III/LAV) is the etiological agent of the acquired immune deficiency syndrome (AIDS)
and related disorders (see, e.g., Barre-Sinoussi, et al., (1983) Science 220:868-871; Gallo et
al. (1984) Science 224:500-503; Levy et al., (1984) Science 225:840-842; Siegal et al., (1981)
N. Engl. J. Med. 305:1439-1444). AIDS patients usually have a long asymptomatic period
followed by the progressive degeneration of the immune system and the central nervous
system. Replication of the virus is highly regulated, and both latent and lytic infection of the
CD4 positive helper subset of T-lymphocytes occur in tissue culture (Zagury et al., (1986)
Science 231:850-853). Molecular studies of HIV-1 show that it encodes a number of genes
(Ratner et al., (1985) Nature 313:277-284; Sanchez-Pescador et al., (1985) Science
227:484-492), including three structural genes (gag, pol and env) that are common to all
retroviruses. Nucleotide sequences from viral genomes of other retroviruses, particularly
HIV-2 and simian immunodeficiency viruses, SIV (previously referred to as STLV-III), also
contain these structural genes. (Guyader et al., (1987) Nature 326:662-669; Chakrabarti et
al., (1987) Nature

The envelope protein of HIV-1, HIV-2 and SIV is a glycoprotein of about 160 kd
(gp160). During virus infection of the host cell, gp160 is cleaved by host cell proteases to
form gp120 and the integral membrane protein, gp41. The gp41 portion is anchored in the
membrane bilayer of the virion, while the gp120 segment protrudes into the surrounding
environment. gp120 and gp41 are non-covalently associated, and free gp120 can be released
from the surface of virions and infected cells.

As depicted in Figure 1, crystallography studies of the gp120 core polypeptide
indicate that this polypeptide is folded into two major domains having certain emanating
structures. The inner domain (inner with respect to the N and C terminus) features a
two-helix, two-stranded bundle with a small five-stranded β-sandwich at its termini-proximal end
and a projection at the distal end from which the V1/V2 stem emanates. The outer domain is
a stacked double barrel that lies alongside the inner domain so that the outer barrel and inner
bundle axes are approximately parallel. Between the distal inner domain and the distal outer
domain is a four-stranded bridging sheet which holds a peculiar minidomain in contact with,
but distinct from, the inner domain, the outer domain, and the V1/V2 domain. The bridging
sheet is composed of four β-strand structures (β-3, β-2, β-21, β-20, shown in Figure 1). The
bridging region can be seen in Figure 1 packing primarily over the inner domain, although some
surface residues of the outer domain, such as Phe 382, reach into the bridging sheet to form
part of its hydrophobic core.

The basic unit of the β-sheet conformation of the bridging sheet region is the β-strand,
which exists as a less tightly coiled helix, with 2.0 residues per turn. The β-strand
conformation is only stable when incorporated into a β-sheet, where hydrogen bonds with
close to optimal geometry are formed between the peptide groups on adjacent β-strands; the
dipole moments of the strands are also aligned favorably. Side chains from adjacent residues
of the same strand protrude from opposite sides of the sheet and do not interact with each
other, but have significant interactions with their backbone and with the side chains of
neighboring strands. For a general description of β-sheets, see, e.g., T.E. Creighton, Proteins:
Structures and Molecular Properties (W.H. Freeman and Company, 1993); and A.L.
Lehninger, Biochemistry (Worth Publishers, Inc., 1975).

The gp120 polypeptide is instrumental in mediating entry into the host cell. Recent
studies have indicated that binding of CD4 to gp120 induces a conformational change in Env
that allows for binding to a co-receptor (e.g., a chemokine receptor) and subsequent entry of
the virus into the cell (Wyatt, R., et al. (1998) Nature 393:705-711; Kwong, P., et al. (1998)
Nature 393:648-659). Referring again to Figure 1, CD4 is bound into a depression formed at
the interface of the outer domain, the inner domain and the bridging sheet of gp120.

Immunogenicity of the gp120 polypeptide has also been studied. For example,
individuals infected by HIV-1 usually develop antibodies that can neutralize the virus in in
vitro assays, and this response is directed primarily against linear neutralizing determinants in
the third variable loop of gp120 glycoprotein (Javaherian, K., et al. (1989) Proc. Natl. Acad.
Sci. 86:6786-6772; Matsushita, M. et al. (1988) J. Virol. 62:2107-2144; Putney, S., et al.
(1986) Science 234:1392-1395; Rushe, J. R., et al. (1988) Proc. Natl. Acad. Sci. USA
85:3198-3202). However, these antibodies generally exhibit the ability to neutralize only a
limited number of HIV-1 strains (Matthews, T. (1986) Proc. Natl. Acad. Sci. USA
83:9709-9713; Nara, P. L., et al. (1988) J. Virol. 62:2622-2628; Palker, T. J., et al. (1988) Proc. Natl.
Acad. Sci. USA 85:1932-1936). Later in the course of HIV infection in humans, antibodies
capable of neutralizing a wider range of HIV-1 isolates appear (Barre-Sinoussi, F., et al.
(1983) Science 220:868-871; Robert-Guroff, M., et al. (1985) Nature (London) 316:72-74;
Weis, R., et al. (1985) Nature (London) 316:69-72; Weis, R., et al. (1986) Nature (London)
324:572-575).

Recent work by Stamatatos et al. (1998) AIDS Res Hum Retroviruses
14(13):1129-39 shows that a deletion of the variable region 2 from an HIV-1 SF162 virus, which
utilizes the CCR-5 co-receptor for virus entry, rendered the virus highly susceptible to
serum-mediated neutralization. This V2-deleted virus was also neutralized by sera obtained from
patients infected not only with clade B HIV-1 isolates but also with clade A, C, D and F
HIV-1 isolates. However, deletion of the variable region 1 had no effect. Deletion of the variable
regions 1 and 2 from a LAI isolate HIV-1 IIIB also increased the susceptibility to neutralization
by monoclonal antibodies whose epitopes are located within the V3 loop, the CD4-binding
site, and conserved gp120 regions (Wyatt, R., et al. (1995) J. Virol. 69:5723-5733). Rabbit
immunogenicity studies done with the HIV-1 virus with deletions in the V1/V2 and V3
region from the LAI strain, which uses the CXCR4 co-receptor for virus entry, showed no
improvement in the ability of Env to raise neutralizing antibodies (Leu et al. (1998) AIDS
Res. and Human Retroviruses 14:151-155).

Further, a subset of the broadly reactive antibodies, found in most infected
individuals, interferes with the binding of gp120 and CD4 (Kang, C.-Y., et al. (1991) Proc.
Natl. Acad. Sci. USA 88:6171-6175; McDougal, J. S., et al. (1986) J. Immunol.
137:2937-2944). Other antibodies are believed to bind to the chemokine receptor binding region after
CD4 has bound to Env (Thali et al. (1993) J. Virol. 67:3978-3988). The fact that neutralizing
antibodies generated during the course of HIV infection do not provide permanent antiviral
effect may in part be due to the generation of "neutralization escape" virus mutants and to
the general decline in the host immune system associated with pathogenesis. In contrast, the
presence of pre-existing neutralizing antibodies upon initial HIV-1 exposure will likely have
a protective effect.

It is widely thought that a successful vaccine should be able to induce a strong,
broadly neutralizing antibody response against diverse HIV-1 strains (Montefiori and Evans
(1999) AIDS Res. Hum. Ret. 15(8):689-698; Bolognesi, D. P., et al. (1994) Ann. Int. Med.
8:603-611; Haynes, B. F., et al. (1996) Science 271:324-328). Neutralizing antibodies, by
attaching to the incoming virions, can reduce or even prevent their infectivity for target cells
and prevent the cell-to-cell spread of virus in tissue culture (Hu et al. (1992) Science
255:456-459; Burton, D. R. and Montefiori, D. (1997) AIDS 11(suppl. A):587-598). However, as
described above, antibodies directed against gp120 do not generally exhibit broad antibody
responses against different HIV strains.

Currently, the focus of vaccine development, from the perspective of humoral
immunity, is on the neutralization of primary isolates that utilize the CCR5 chemokine
co-receptor believed to be important in virus entry (Zhu, T., et al. (1993) Science
261:1179-1181; Fiore, J., et al. (1994) Virology 204:297-303). These viruses are generally much more
resistant to antibody neutralization than T-cell line adapted strains that use the CXCR4
co-receptor, although both can be neutralized in vitro by certain broadly and potently acting
monoclonal antibodies, such as IgG1b12, 2G12 and 2F5 (Trkola, A., et al. (1995) J. Virol.
69:6609-6617; D'Sousa, P. M., et al. (1997) J. Infect. Dis. 175:1062-1075). These monoclonal
antibodies are directed to the CD4 binding site, a glycosylation site and to the gp41 fusion
domain, respectively. The problem that remains, however, is that it is not known how to
induce antibodies of the appropriate specificity by vaccination. Antibodies (Abs) elicited by
gp120 glycoprotein from a given isolate are usually only able to neutralize closely related
viruses, generally from a similar, and usually from the same, HIV-1 subtype.

Despite the above approaches, there remains a need for Env antigens that can elicit an
immunological response (e.g., neutralizing and/or protective antibodies) in a subject against
multiple HIV strains and subtypes, for example when administered as a vaccine. The present
invention solves these and other problems by providing modified Env polypeptides (e.g.,
gp120) to expose epitopes in or near the CD4 binding site.

Summary of the Invention

In accordance with the present invention, modified HIV Env polypeptides are
provided. In particular, deletions and/or mutations are made in one or more strands of the 4-β
antiparallel bridging sheet in the HIV Env polypeptide. In this way, enough structure is left
to allow correct folding of the polypeptide, for example of gp120, yet enough of the bridging
sheet is removed to expose the CD4 groove, allowing an immune response to be generated
against epitopes in or near the CD4 binding site of the Env polypeptide (e.g., gp120).

In one aspect, the invention includes a polynucleotide encoding a modified HIV Env
polypeptide wherein the polypeptide has at least one amino acid residue modified (e.g.,
deleted or replaced) in the region corresponding to residues 421 to 436 relative to
HXB-2, for example the constructs depicted in Figures 6-29 (SEQ ID NOs:3 to 26). In
certain embodiments, the polynucleotide also has the region corresponding to residues
124-198 of the polypeptide HXB-2 (e.g., V1/V2) deleted and at least one amino acid deleted or
replaced in the regions corresponding to the residues 119 to 123 and 199 to 210, relative to
HXB-2. In other embodiments, these polynucleotides encode Env polypeptides having at
least one amino acid of the small loop of the bridging sheet (e.g., amino acid residues 427 to
429 relative to HXB-2) deleted or replaced. The amino acid sequences of the modified
polypeptides encoded by the polynucleotides of the present invention can be based on any
HIV variant, for example SF162.

In another aspect, the invention includes immunogenic modified HIV Env
polypeptides having at least one amino acid residue modified (e.g., deleted or replaced)
in the region corresponding to residues 421 to 436 relative to HXB-2, for example a
deletion or replacement of one or more amino acids in the small loop region (e.g., amino acid
residues 427 to 429 relative to HXB-2). These polypeptides may have modifications (e.g., a deletion
or a replacement) of at least one amino acid between about amino acid residue 420 and amino
acid residue 436, relative to HXB-2 and, optionally, may have deletions or truncations of the
V1 and/or V2 regions. The immunogenic, modified polypeptides of the present invention can
be based on any HIV variant, for example SF162.

In another aspect, the invention includes a vaccine composition comprising any of the
polynucleotides encoding modified Env polypeptides described above. Vaccine
compositions comprising the modified Env polypeptides and, optionally, an adjuvant are also
included in the invention.

In yet another aspect, the invention includes a method of inducing an immune
response in a subject comprising administering one or more of the polynucleotides or
constructs described above in an amount sufficient to induce an immune response in the
subject. In certain embodiments, the method further comprises administering an adjuvant to
the subject.

In another aspect, the invention includes a method of inducing an immune response in
a subject comprising administering a composition comprising any of the modified Env
polypeptides described above and an adjuvant. The composition is administered in an
amount sufficient to induce an immune response in the subject.

In another aspect, the invention includes a method of inducing an immune response in
a subject comprising

(a) administering a first composition comprising any of the polynucleotides described
above in a priming step and

(b) administering a second composition comprising any of the modified Env
polypeptides described above, as a booster, in an amount sufficient to induce an immune
response in the subject. In certain embodiments, the first composition, the second
composition or both the first and second compositions further comprise an adjuvant.

These and other embodiments of the subject invention will readily occur to those of 
skill in the art in light of the disclosure herein. 

Brief Description of the Drawings 

Figure 1 is a schematic depiction of the tertiary structure of the HIV-1 HXB-2 Env gp120
polypeptide, as determined by crystallography studies.

Figures 2A-C depict alignment of the amino acid sequence of wild-type HIV-1 HXB-2
Env gp160 polypeptide (SEQ ID NO:1) with amino acid sequence of HIV variants SF162
(shown as "162") (SEQ ID NO:2), SF2, CM236 and US4. Arrows indicate the regions that
are deleted or replaced in the modified polypeptides. Black dots indicate conserved cysteine
residues. The star indicates the position of the last amino acid in gp120.

Figures 3A-J depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having V1/V2 deletions. The unmodified amino acid residues
encoded by these sequences correspond to wildtype SF162 residues but are numbered relative
to HXB-2.

Figures 4A-M depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having deletions or replacements in the small loop. The
unmodified amino acid residues encoded by these sequences correspond to wildtype SF162
residues but are numbered relative to HXB-2.

Figures 5A-N depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having both V1/V2 deletions and, in addition, deletions or
replacements in the small loop. The unmodified amino acid residues encoded by these
sequences correspond to wildtype SF162 residues but are numbered relative to HXB-2.

Figure 6 depicts the nucleotide sequence of the construct designated Val120-Ala204
(SEQ ID NO:3).

Figure 7 depicts the nucleotide sequence of the construct designated Val120-Ile201
(SEQ ID NO:4).

Figure 8 depicts the nucleotide sequence of the construct designated Val120-Ile201B
(SEQ ID NO:5).

Figure 9 depicts the nucleotide sequence of the construct designated Lys121-Val200
(SEQ ID NO:6).

Figure 10 depicts the nucleotide sequence of the construct designated Leu122-Ser199
(SEQ ID NO:7).

Figure 11 depicts the nucleotide sequence of the construct designated Val120-Thr202
(SEQ ID NO:8).

Figure 12 depicts the nucleotide sequence of the construct designated Trp427-Gly431
(SEQ ID NO:9).

Figure 13 depicts the nucleotide sequence of the construct designated Arg426-Gly431
(SEQ ID NO:10).

Figure 14 depicts the nucleotide sequence of the construct designated Arg426-Gly431B
(SEQ ID NO:11).

Figure 15 depicts the nucleotide sequence of the construct designated Arg426-Lys432
(SEQ ID NO:12).

Figure 16 depicts the nucleotide sequence of the construct designated Asn425-Lys432
(SEQ ID NO:13).

Figure 17 depicts the nucleotide sequence of the construct designated Ile424-Ala433
(SEQ ID NO:14).

Figure 18 depicts the nucleotide sequence of the construct designated Ile423-Met434
(SEQ ID NO:15).

Figure 19 depicts the nucleotide sequence of the construct designated Gln422-Tyr435
(SEQ ID NO:16).

Figure 20 depicts the nucleotide sequence of the construct designated Gln422-Tyr435B
(SEQ ID NO:17).

Figure 21 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Arg426-Gly431 (SEQ ID NO:18).

Figure 22 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Arg426-Lys432 (SEQ ID NO:19).

Figure 23 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Trp427-Gly431 (SEQ ID NO:20).

Figure 24 depicts the nucleotide sequence of the construct designated Lys121-Val200;
Asn425-Lys432 (SEQ ID NO:21).

Figure 25 depicts the nucleotide sequence of the construct designated Val120-Ile201;
Ile424-Ala433 (SEQ ID NO:22).

Figure 26 depicts the nucleotide sequence of the construct designated Val120-Ile201B;
Ile424-Ala433 (SEQ ID NO:23).

Figure 27 depicts the nucleotide sequence of the construct designated Val120-Thr202;
Ile424-Ala433 (SEQ ID NO:24).

Figure 28 depicts the nucleotide sequence of the construct designated Val127-Asn195
(SEQ ID NO:25).

Figure 29 depicts the nucleotide sequence of the construct designated Val127-Asn195;
Arg426-Gly431 (SEQ ID NO:26).

Detailed Description of the Invention 

The practice of the present invention will employ, unless otherwise indicated,
conventional methods of protein chemistry, viral immunobiology, molecular biology and
recombinant DNA techniques within the skill of the art. Such techniques are explained fully
in the literature. See, e.g., T.E. Creighton, Proteins: Structures and Molecular Properties
(W.H. Freeman and Company, 1993); Nelson L.M. and Jerome H.K., HIV Protocols in
Methods in Molecular Medicine, vol. 17, 1999; Sambrook, et al., Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor Laboratory, 1989); F.M. Ausubel et al., Current
Protocols in Molecular Biology, Greene Publishing Associates & Wiley Interscience, New
York; and Lipkowitz and Boyd, Reviews in Computational Chemistry, volumes 1-present
(Wiley-VCH, New York, New York, 1999).

It must be noted that, as used in this specification and the appended claims, the
singular forms "a", "an" and "the" include plural referents unless the content clearly dictates
otherwise. Thus, for example, reference to "a polypeptide" includes a mixture of two or more
polypeptides, and the like.

Definitions

In describing the present invention, the following terms will be employed, and are
intended to be defined as indicated below.

The terms "polypeptide," and "protein" are used interchangeably herein to denote any
polymer of amino acid residues. The terms encompass peptides, oligopeptides, dimers,
multimers, and the like. Such polypeptides can be derived from natural sources or can be
synthesized or recombinantly produced. The terms also include postexpression modifications
of the polypeptide, for example, glycosylation, acetylation, phosphorylation, etc.

A polypeptide as defined herein is generally made up of the 20 natural amino acids
Ala (A), Arg (R), Asn (N), Asp (D), Cys (C), Gln (Q), Glu (E), Gly (G), His (H), Ile (I), Leu
(L), Lys (K), Met (M), Phe (F), Pro (P), Ser (S), Thr (T), Trp (W), Tyr (Y) and Val (V) and
may also include any of the several known amino acid analogs, both naturally occurring and
synthesized analogs, such as but not limited to homoisoleucine, asaleucine,
2-(methylenecyclopropyl)glycine, S-methylcysteine, S-(prop-1-enyl)cysteine, homoserine,
ornithine, norleucine, norvaline, homoarginine, 3-(3-carboxyphenyl)alanine,
cyclohexylalanine, mimosine, pipecolic acid, 4-methylglutamic acid, canavanine,
2,3-diaminopropionic acid, and the like. Further examples of polypeptide agents which will find
use in the present invention are set forth below.

By "geometry" or "tertiary structure" of a polypeptide or protein is meant the overall
3-D configuration of the protein. As described herein, the geometry can be determined, for
example, by crystallography studies or by using various programs or algorithms which
predict the geometry based on interactions between the amino acids making up the primary
and secondary structures.

By "wild type" polypeptide, polypeptide agent or polypeptide drug, is meant a
naturally occurring polypeptide sequence, and its corresponding secondary structure. An
"isolated" or "purified" protein or polypeptide is a protein which is separate and discrete from
a whole organism with which the protein is normally associated in nature. It is apparent that
the term denotes proteins of various levels of purity. Typically, a composition containing a
purified protein will be one in which at least about 35%, preferably at least about 40-50%,
more preferably, at least about 75-85%, and most preferably at least about 90% or more, of
the total protein in the composition will be the protein in question.

By "Env polypeptide" is meant a molecule derived from an envelope protein,
preferably from HIV Env. The envelope protein of HIV-1 is a glycoprotein of about 160 kd
(gp160). During virus infection of the host cell, gp160 is cleaved by host cell proteases to
form gp120 and the integral membrane protein, gp41. The gp41 portion is anchored in (and
spans) the membrane bilayer of the virion, while the gp120 segment protrudes into the
surrounding environment. As there is no covalent attachment between gp120 and gp41, free
gp120 is released from the surface of virions and infected cells. Env polypeptides may also
include gp140 polypeptides. Env polypeptides can exist as monomers, dimers or multimers.

By a "gp120 polypeptide" is meant a molecule derived from a gp120 region of the
Env polypeptide. Preferably, the gp120 polypeptide is derived from HIV Env. The primary
amino acid sequence of gp120 is approximately 511 amino acids, with a polypeptide core of
about 60,000 daltons. The polypeptide is extensively modified by N-linked glycosylation to
increase the apparent molecular weight of the molecule to 120,000 daltons. The amino acid
sequence of gp120 contains five relatively conserved domains interspersed with five
hypervariable domains. The positions of the 18 cysteine residues in the gp120 primary
sequence of the HIV-1 HXB-2 (hereinafter "HXB-2") strain, and the positions of 13 of the
approximately 24 N-linked glycosylation sites in the gp120 sequence are common to most, if
not all, gp120 sequences. The hypervariable domains contain extensive amino acid
substitutions, insertions and deletions. Despite this variation, most, if not all, gp120
sequences preserve the virus's ability to bind to the viral receptor CD4. A "gp120
polypeptide" includes both single subunits and multimers.

Env polypeptides (e.g., gp120, gp140 and gp160) include a "bridging sheet"
comprised of 4 anti-parallel β-strands (β-2, β-3, β-20 and β-21) that form a β-sheet.
Extruding from one pair of the β-strands (β-2 and β-3) are two loops, V1 and V2. The β-2
strand occurs at approximately amino acid residue 119 (Cys) to amino acid residue 123 (Thr)
while β-3 occurs at approximately amino acid residue 199 (Ser) to amino acid residue 201
(Ile), relative to HXB-2. The "V1/V2 region" occurs at approximately amino acid positions
126 (Cys) to residue 196 (Cys), relative to HXB-2 (see, e.g., Wyatt et al. (1995) J. Virol.
69:5723-5733; Stamatatos et al. (1998) J. Virol. 72:7840-7845). Extruding from the second
pair of β-strands (β-20 and β-21) is a "small-loop" structure, also referred to herein as "the
bridging sheet small loop." In HXB-2, β-20 extends from about amino acid residue 422
(Gln) to amino acid residue 426 (Met) while β-21 extends from about amino acid residue 430
(Val) to amino acid residue 435 (Tyr). In variant SF162, the Met-426 is an Arg (R) residue.
The "small loop" extends from about amino acid residue 427 (Trp) through 429 (Lys),
relative to HXB-2. A representative diagram of gp120 showing the bridging sheet, the small
loop, and V1/V2 is shown in Figure 1. In addition, alignment of the amino acid sequences of
Env polypeptide gp160 of selected variants is shown, relative to HXB-2, in Figures 2A-C.
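
The approximate residue ranges given in the preceding definition can be collected into a simple lookup, sketched below in Python. This is only an illustration: the dictionary name, the region labels and the function `regions_for_residue` are hypothetical and are not part of the disclosure; the ranges themselves are the approximate HXB-2 positions stated above.

```python
# Approximate HXB-2 residue ranges quoted in the definition above.
# The names used here are illustrative assumptions, not terms of the disclosure.
BRIDGING_SHEET_REGIONS = {
    "beta-2": (119, 123),
    "V1/V2 region": (126, 196),
    "beta-3": (199, 201),
    "beta-20": (422, 426),
    "small loop": (427, 429),
    "beta-21": (430, 435),
}


def regions_for_residue(position):
    """Return the names of all listed regions that contain an HXB-2 position."""
    return [
        name
        for name, (start, end) in BRIDGING_SHEET_REGIONS.items()
        if start <= position <= end
    ]


# Example lookups; an empty list means the position is outside the listed regions.
for pos in (121, 150, 200, 428, 300):
    print(pos, regions_for_residue(pos))
```

Because the ranges are stated as approximate, such a lookup is best treated as a rough guide to where a residue falls rather than as a strict boundary test.
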

Furthermore, an "Env polypeptide" or "gp120 polypeptide" as defined herein is not
limited to a polypeptide having the exact sequence described herein. Indeed, the HIV
genome is in a state of constant flux and contains several variable domains which exhibit
relatively high degrees of variability between isolates. It is readily apparent that the terms
encompass Env (e.g., gp120) polypeptides from any of the identified HIV isolates, as well as
newly identified isolates, and subtypes of these isolates. Descriptions of structural features
are given herein with reference to HXB-2. One of ordinary skill in the art in view of the
teachings of the present disclosure and the art can determine corresponding regions in other
HIV variants (e.g., isolates HIV IIIb, HIV SF2, HIV-1 SF162, HIV-1 SF170, HIV LAV, HIV LAI, HIV MN,
HIV-1 CM235, HIV-1 US4, other HIV-1 strains from diverse subtypes (e.g., subtypes A through
G, and O), HIV-2 strains and diverse subtypes (e.g., HIV-2 UC1 and HIV-2 UC2), and simian
immunodeficiency virus (SIV); see, e.g., Virology, 3rd Edition (W.K. Joklik ed. 1988);
Fundamental Virology, 2nd Edition (B.N. Fields and D.M. Knipe, eds. 1991); Virology, 3rd
Edition (Fields, BN, DM Knipe, PM Howley, Editors, 1996, Lippincott-Raven, Philadelphia,
PA), for a description of these and other related viruses), using, for example, sequence
comparison programs (e.g., BLAST and others described herein) or identification and
alignment of structural features (e.g., a program such as the "ALB" program described herein
that can identify β-sheet regions). The actual amino acid sequences of the modified Env
polypeptides can be based on any HIV variant.
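
Numbering a variant "relative to HXB-2" can be illustrated with a minimal Python sketch. It assumes that a gapped pairwise alignment of the reference and the variant has already been produced by some sequence comparison program; the function name `number_relative_to_reference` and the short toy sequences are hypothetical and are not taken from the figures.

```python
def number_relative_to_reference(ref_aligned, var_aligned):
    """Map each variant residue to reference numbering.

    Both arguments are rows of one pairwise alignment ('-' marks a gap) and
    must have equal length. Returns (variant_index, residue, reference_number)
    tuples; reference_number is None where the variant carries an insertion
    relative to the reference.
    """
    if len(ref_aligned) != len(var_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_pos = 0
    var_pos = 0
    mapping = []
    for ref_char, var_char in zip(ref_aligned, var_aligned):
        if ref_char != "-":
            ref_pos += 1  # advance reference numbering on every reference residue
        if var_char != "-":
            var_pos += 1
            ref_number = ref_pos if ref_char != "-" else None
            mapping.append((var_pos, var_char, ref_number))
    return mapping


# Toy alignment (made-up sequences): the variant lacks reference residue 3
# and carries one inserted residue after it.
for entry in number_relative_to_reference("ACD-EFG", "AC-KEFG"):
    print(entry)
```

In this sketch a deletion in the variant simply skips a reference number, while an insertion receives no reference number; other conventions for labeling inserted residues exist and could be substituted.
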

Additionally, the term "Env polypeptide" (e.g., "gp120 polypeptide") encompasses
proteins which include additional modifications to the native sequence, such as additional
internal deletions, additions and substitutions. These modifications may be deliberate, as
through site-directed mutagenesis, or may be accidental, such as through naturally occurring
mutational events. Thus, for example, if the Env polypeptide is to be used in vaccine
compositions, the modifications must be such that immunological activity (i.e., the ability to
elicit an antibody response to the polypeptide) is not lost. Similarly, if the polypeptides are to
be used for diagnostic purposes, such capability must be retained.

Thus, a "modified Env polypeptide" is an Env polypeptide (e.g., gp120 as defined
above), which has been manipulated to delete or replace all or a part of the bridging sheet
portion and, optionally, the variable regions V1 and V2. Generally, modified Env (e.g.,
gp120) polypeptides have enough of the bridging sheet removed to expose the CD4 binding
site, but leave enough of the structure to allow correct folding (e.g., correct geometry). Thus,
modifications to the β-20 and β-21 regions (between about amino acid residues 420 and 435
relative to HXB-2) are preferred. Additionally, modifications to the β-2 and β-3 regions
(between about amino acid residues 119 (Cys) and 201 (Ile)) and modifications (e.g.,
truncations) to the V1 and V2 loop regions may also be made. Although not all possible
β-sheet and V1/V2 modifications have been exemplified herein, it is to be understood that other
disrupting modifications are also encompassed by the present invention.

Normally, such a modified polypeptide is capable of secretion into growth medium in
which an organism expressing the protein is cultured. However, for purposes of the present
invention, such polypeptides may also be recovered intracellularly. Secretion into growth
media is readily determined using a number of detection techniques, including, e.g.,
polyacrylamide gel electrophoresis and the like, and immunological techniques such as
Western blotting and immunoprecipitation assays as described in, e.g., International
Publication No. WO 96/04301, published February 15, 1996.

A gp120 or other Env polypeptide is produced "intracellularly" when it is found
within the cell, either associated with components of the cell, such as in association with the
endoplasmic reticulum (ER) or the Golgi Apparatus, or when it is present in the soluble
cellular fraction. The gp120 and other Env polypeptides of the present invention may also be
secreted into growth medium so long as sufficient amounts of the polypeptides remain
present within the cell such that they can be purified from cell lysates using techniques
described herein.

An "immunogenic" gp120 or other Env protein is a molecule that includes at least one
epitope such that the molecule is capable of either eliciting an immunological reaction in an
individual to which the protein is administered or, in the diagnostic context, is capable of
reacting with antibodies directed against the HIV in question.

By "epitope" is meant a site on an antigen to which specific B cells and/or T cells
respond, rendering the molecule including such an epitope capable of eliciting an
immunological reaction or capable of reacting with HIV antibodies present in a biological
sample. The term is also used interchangeably with "antigenic determinant" or "antigenic
determinant site." An epitope can comprise 3 or more amino acids in a spatial conformation
unique to the epitope. Generally, an epitope consists of at least 5 such amino acids and, more
usually, consists of at least 8-10 such amino acids. Methods of determining spatial
conformation of amino acids are known in the art and include, for example, x-ray
crystallography and 2-dimensional nuclear magnetic resonance. Furthermore, the
identification of epitopes in a given protein is readily accomplished using techniques well
known in the art, such as by the use of hydrophobicity studies and by site-directed serology.
See, also, Geysen et al., Proc. Natl. Acad. Sci. USA (1984) 81:3998-4002 (general method of
rapidly synthesizing peptides to determine the location of immunogenic epitopes in a given
antigen); U.S. Patent No. 4,708,871 (procedures for identifying and chemically synthesizing
epitopes of antigens); and Geysen et al., Molecular Immunology (1986) 23:709-715
(technique for identifying peptides with high affinity for a given antibody). Antibodies that
recognize the same epitope can be identified in a simple immunoassay showing the ability of
one antibody to block the binding of another antibody to a target antigen.

An "immunological response" or "immune response" as used herein is the
development in the subject of a humoral and/or a cellular immune response to the Env (e.g.,
gp120) polypeptide when the polypeptide is present in a vaccine composition. Antibodies so
elicited may also neutralize infectivity, and/or mediate antibody-complement or antibody
dependent cell cytotoxicity to provide protection to an immunized host. Immunological
reactivity may be determined in standard immunoassays, such as competition assays, well
known in the art.

Techniques for determining amino acid sequence "similarity" are well known in the
art. In general, "similarity" means the exact amino acid to amino acid comparison of two or
more polypeptides at the appropriate place, where amino acids are identical or possess similar
chemical and/or physical properties such as charge or hydrophobicity. A so-termed "percent
similarity" then can be determined between the compared polypeptide sequences.

Techniques for determining nucleic acid and amino acid sequence identity also are well 
known in the art and include determining the nucleotide sequence of the mRNA for that gene 
(usually via a cDNA intermediate) and determining the amino acid sequence encoded 
thereby, and comparing this to a second amino acid sequence. In general, "identity" refers to
an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two
polynucleotides or polypeptide sequences, respectively.

Two or more polynucleotide sequences can be compared by determining their
"percent identity." Two or more amino acid sequences likewise can be compared by
determining their "percent identity." The percent identity of two sequences, whether nucleic
acid or peptide sequences, is generally described as the number of exact matches between two
aligned sequences divided by the length of the shorter sequence and multiplied by 100. An
approximate alignment for nucleic acid sequences is provided by the local homology
algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981).
This algorithm can be extended to use with peptide sequences using the scoring matrix
developed by Dayhoff, Atlas of Protein Sequences and Structure, M.O. Dayhoff ed., 5 suppl.
3:353-358, National Biomedical Research Foundation, Washington, D.C., USA, and
normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An implementation of
this algorithm for nucleic acid and peptide sequences is provided by the Genetics Computer
Group (Madison, WI) in their BestFit utility application. The default parameters for this
method are described in the Wisconsin Sequence Analysis Package Program Manual, Version
8 (1995) (available from Genetics Computer Group, Madison, WI). Other equally suitable
programs for calculating the percent identity or similarity between sequences are generally
known in the art.
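
By way of illustration only, the percent identity calculation described above (exact matches between two aligned sequences, divided by the length of the shorter sequence, multiplied by 100) may be sketched as follows. The function name, the use of "-" as the gap character, and the choice to measure sequence length with gaps removed are assumptions made for this sketch and are not taken from any of the programs cited herein.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal aligned length.

    Exact matches are counted column by column; the divisor is the length of
    the shorter sequence once gap characters are removed (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    a, b = aligned_a.upper(), aligned_b.upper()

    # A column is a match only when both residues are identical and not a gap.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)

    shorter = min(len(a.replace(gap, "")), len(b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("each sequence must contain at least one residue")

    return 100.0 * matches / shorter


if __name__ == "__main__":
    # Seven of eight aligned positions match.
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

The same function applies to aligned peptide sequences, since it compares characters without reference to the alphabet used.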

For example, percent identity of a particular nucleotide sequence to a reference
sequence can be determined using the homology algorithm of Smith and Waterman with a
default scoring table and a gap penalty of six nucleotide positions. Another method of
establishing percent identity in the context of the present invention is to use the MPSRCH
package of programs copyrighted by the University of Edinburgh, developed by John F.
Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, CA).
From this suite of packages, the Smith-Waterman algorithm can be employed where default
parameters are used for the scoring table (for example, gap open penalty of 12, gap extension
penalty of one, and a gap of six). From the data generated, the "Match" value reflects
"sequence identity." Other suitable programs for calculating the percent identity or similarity
between sequences are generally known in the art, such as the alignment program BLAST,
which can also be used with default parameters. For example, BLASTN and BLASTP can be
used with the following default parameters: genetic code = standard; filter = none; strand =
both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences; sort by =
HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank
CDS translations + Swiss protein + Spupdate + PIR. Details of these programs can be found
at the following internet address: http://www.ncbi.nlm.gov/cgi-bin/BLAST.
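
For orientation, a minimal sketch of the Smith-Waterman local alignment score is given below. It is not the MPSRCH, BestFit or BLAST implementation. The match, mismatch and gap values are placeholders chosen for this sketch, and a single linear gap penalty is assumed in place of the separate gap open and gap extension penalties mentioned above.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 1,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Best local alignment score of a and b (linear gap penalty).

    Scoring values are illustrative assumptions, not the defaults of any
    cited package.
    """
    a, b = a.upper(), b.upper()
    cols = len(b) + 1
    prev = [0] * cols          # previous row of the dynamic-programming matrix
    best = 0

    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap introduced in sequence b
            left = curr[j - 1] + gap  # gap introduced in sequence a
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr

    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Keeping only two rows of the matrix suffices when the score alone is needed; recovering the aligned residues would require retaining the full matrix and tracing back from the best-scoring cell.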

One of skill in the art can readily determine the proper search parameters to use for a
given sequence in the above programs. For example, the search parameters may vary based
on the size of the sequence in question. Thus, for example, a representative embodiment of
the present invention would include an isolated polynucleotide having X contiguous
nucleotides, wherein (i) the X contiguous nucleotides have at least about 50% identity to Y
contiguous nucleotides derived from any of the sequences described herein, (ii) X equals Y,
and (iii) X is greater than or equal to 6 nucleotides and up to 5000 nucleotides, preferably
greater than or equal to 8 nucleotides and up to 5000 nucleotides, more preferably 10-12
nucleotides and up to 5000 nucleotides, and even more preferably 15-20 nucleotides, up to
the number of nucleotides present in the full-length sequences described herein (e.g., see the
Sequence Listing and claims), including all integer values falling within the above-described
ranges.
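
The comparison of X contiguous nucleotides against Y contiguous nucleotides, where X equals Y, can be pictured as an ungapped sliding-window search. The sketch below is one hypothetical way to perform such a check. The function name, the exhaustive search strategy, and the return format are assumptions made for illustration, and no gaps are permitted within a window.

```python
from typing import Optional, Tuple


def find_identity_window(query: str, reference: str, window: int,
                         min_percent: float = 50.0
                         ) -> Optional[Tuple[int, int, float]]:
    """Find the first pair of equal-length windows meeting the identity cutoff.

    Returns (query_start, reference_start, percent_identity), or None when no
    window of the requested length reaches min_percent.
    """
    query, reference = query.upper(), reference.upper()
    if window <= 0 or window > min(len(query), len(reference)):
        return None

    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            r = reference[j:j + window]
            # Ungapped comparison: count positions with identical bases.
            matches = sum(1 for x, y in zip(q, r) if x == y)
            percent = 100.0 * matches / window
            if percent >= min_percent:
                return i, j, percent
    return None


if __name__ == "__main__":
    print(find_identity_window("GGGACGTACGGG", "TTTACGTACTTT", 6, 100.0))
```

Because the search is exhaustive, its cost grows with the product of the two sequence lengths; for sequences approaching the 5000-nucleotide upper bound an indexed search would likely be preferable.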

The synthetic expression cassettes (and purified polynucleotides) of the present
invention include related polynucleotide sequences having about 80% to 100%, greater than
80-85%, preferably greater than 90-92%, more preferably greater than 95%, and most
preferably greater than 98% sequence identity (including all integer values falling within these
described ranges) to the synthetic expression cassette sequences disclosed herein (for
example, to the claimed sequences or other sequences of the present invention) when the
sequences of the present invention are used as the query sequence.

Computer programs are also available to determine the likelihood of certain
polypeptides to form structures such as β-sheets. One such program, described herein, is the
"ALB" program for protein and polypeptide secondary structure calculation and prediction.
In addition, secondary protein structure can be predicted from the primary amino acid
sequence, for example using protein crystal structure and aligning the protein sequence
related to the crystal structure (e.g., using Molecular Operating Environment (MOE)
programs available from the Chemical Computing Group Inc., Montreal, P.Q., Canada).
Other methods of predicting secondary structures are described, for example, in Garnier et al.
(1996) Methods Enzymol. 266:540-553; Geourjon et al. (1995) Comput. Applic. Biosci.
11:681-684; Levin (1997) Protein Eng. 10:771-776; and Rost et al. (1993) J. Molec. Biol.
232:584-599.

Homology can also be determined by hybridization of polynucleotides under
conditions which form stable duplexes between homologous regions, followed by digestion
with single-stranded-specific nuclease(s), and size determination of the digested fragments.
Two DNA sequences, or two polypeptide sequences, are "substantially homologous" to each
other when the sequences exhibit at least about 80%-85%, preferably at least about 90%, and
most preferably at least about 95%-98% sequence identity over a defined length of the molecules,
as determined using the methods above. As used herein, substantially homologous also refers
to sequences showing complete identity to the specified DNA or polypeptide sequence. DNA
sequences that are substantially homologous can be identified in a Southern hybridization
experiment under, for example, stringent conditions, as defined for that particular system.
Defining appropriate hybridization conditions is within the skill of the art. See, e.g.,
Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra.

A "coding sequence" or a sequence which "encodes" a selected protein, is a nucleic
acid sequence which is transcribed (in the case of DNA) and translated (in the case of
mRNA) into a polypeptide in vitro or in vivo when placed under the control of appropriate
regulatory sequences. The boundaries of the coding sequence are determined by a start codon
at the 5' (amino) terminus and a translation stop codon at the 3' (carboxy) terminus. A coding
sequence can include, but is not limited to cDNA from viral nucleotide sequences as well as
synthetic and semisynthetic DNA sequences and sequences including base analogs. A
transcription termination sequence may be located 3' to the coding sequence.
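
The boundaries described above, a start codon at the 5' end and an in-frame translation stop codon at the 3' end, can be located computationally. The following is a simplified sketch that assumes a DNA sequence written 5' to 3', ATG as the start codon, and the three standard stop codons; the function name is hypothetical, and only the given strand is searched.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons, DNA alphabet


def find_coding_sequence(dna: str,
                         start_codon: str = "ATG"
                         ) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first start-to-stop open reading frame.

    `start` is the index of the start codon and `end` is one past the last
    base of the stop codon, so dna[start:end] is the coding sequence.
    Returns None when no start codon is followed by an in-frame stop codon.
    """
    dna = dna.upper()
    start = dna.find(start_codon)
    while start != -1:
        # Walk codon by codon in the reading frame set by this start codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
        start = dna.find(start_codon, start + 1)
    return None


if __name__ == "__main__":
    seq = "CCATGAAATTTTAGGG"
    bounds = find_coding_sequence(seq)
    if bounds is not None:
        print(seq[bounds[0]:bounds[1]])
```

Note that a stop triplet that is out of frame, such as the TGA overlapping the start codon in the example, does not terminate the reading frame.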
"Control elements" refers collectively to promoter sequences, ribosome binding sites,
polyadenylation signals, transcription termination sequences, upstream regulatory domains,
enhancers, and the like, which collectively provide for the transcription and translation of a
coding sequence in a host cell. Not all of these control elements need always be present so
long as the desired gene is capable of being transcribed and translated.

A control element "directs the transcription" of a coding sequence in a cell when RNA 
polymerase will bind the promoter sequence and transcribe the coding sequence into mRNA, 
which is then translated into the polypeptide encoded by the coding sequence. 

"Operably linked" refers to an arrangement of elements wherein the components so 

described are configured so as to perform their usual function. Thus, control elements
operably linked to a coding sequence are capable of effecting the expression of the coding
sequence when RNA polymerase is present. The control elements need not be contiguous
with the coding sequence, so long as they function to direct the expression thereof. Thus, for
example, intervening untranslated yet transcribed sequences can be present between, e.g., a
promoter sequence and the coding sequence and the promoter sequence can still be
considered "operably linked" to the coding sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a 
polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its 
origin or manipulation: (1) is not associated with all or a portion of the polynucleotide with 

which it is associated in nature; and/or (2) is linked to a polynucleotide other than that to
which it is linked in nature. The term "recombinant" as used with respect to a protein or
polypeptide means a polypeptide produced by expression of a recombinant polynucleotide.
"Recombinant host cells," "host cells," "cells," "cell lines," "cell cultures," and other such
terms denoting procaryotic microorganisms or eucaryotic cell lines cultured as unicellular
entities, are used interchangeably, and refer to cells which can be, or have been, used as
recipients for recombinant vectors or other transfer DNA, and include the progeny of the
original cell which has been transfected. It is understood that the progeny of a single parental
cell may not necessarily be completely identical in morphology or in genomic or total DNA
complement to the original parent, due to accidental or deliberate mutation. Progeny of the
parental cell which are sufficiently similar to the parent to be characterized by the relevant
property, such as the presence of a nucleotide sequence encoding a desired peptide, are
included in the progeny intended by this definition, and are covered by the above terms.

By "vertebrate subject" is meant any member of the subphylum chordata, including,
without limitation, humans and other primates, including non-human primates such as
chimpanzees and other apes and monkey species; farm animals such as cattle, sheep, pigs,
goats and horses; domestic mammals such as dogs and cats; laboratory animals including
rodents such as mice, rats and guinea pigs; birds, including domestic, wild and game birds
such as chickens, turkeys and other gallinaceous birds, ducks, geese, and the like. The term
does not denote a particular age. Thus, both adult and newborn individuals are intended to be
covered.

As used herein, a "biological sample" refers to a sample of tissue or fluid isolated
from an individual, including but not limited to, for example, blood, plasma, serum, fecal
matter, urine, bone marrow, bile, spinal fluid, lymph fluid, samples of the skin, external
secretions of the skin, respiratory, intestinal, and genitourinary tracts, samples derived from
the gastric epithelium and gastric mucosa, tears, saliva, milk, blood cells, organs, biopsies
and also samples of in vitro cell culture constituents including but not limited to conditioned
media resulting from the growth of cells and tissues in culture medium, e.g., recombinant
cells, and cell components.

The terms "label" and "detectable label" refer to a molecule capable of detection,
including, but not limited to, radioactive isotopes, fluorescers, chemiluminescers, enzymes,
enzyme substrates, enzyme cofactors, enzyme inhibitors, chromophores, dyes, metal ions,
metal sols, ligands (e.g., biotin or haptens) and the like. The term "fluorescer" refers to a
substance or a portion thereof which is capable of exhibiting fluorescence in the detectable
range. Particular examples of labels which may be used with the invention include, but are
not limited to fluorescein, rhodamine, dansyl, umbelliferone, Texas red, luminol, acridinium
esters, NADPH, α-β-galactosidase, horseradish peroxidase, glucose oxidase, alkaline
phosphatase and urease.

Overview 

The present invention concerns modified Env polypeptide molecules (e.g.,
glycoprotein ("gp") 120). Without being bound by a particular theory, it appears that it has
been difficult to generate immunological responses against Env because the CD4 binding site
is buried between the outer domain, the inner domain and the V1/V2 domains. Thus,
although deletion of the V1/V2 domain may render the virus more susceptible to
neutralization by monoclonal antibody directed to the CD4 site, the bridging sheet covering
most of the CD4 binding domain may prevent an antibody response. Thus, the present
invention provides Env polypeptides that maintain their general overall structure yet expose
the CD4 binding domain. This allows the generation of an immune response (e.g., an
antibody response) to epitopes in or near the CD4 binding site.

Various forms of the different embodiments of the invention, described herein, may 
be combined. 

β-Sheet Conformations

In the present invention, the locations of the β-sheet structures were identified relative to
the 3-D (crystal) structure of an HXB-2 crystallized Env protein (see, Example 1.A.). Based on
this structure, constructs encoding polypeptides having replacements and/or excisions which
maintain overall geometry while exposing the CD4 binding site were designed. In particular,
the crystal structure of HXB-2 was downloaded from the Brookhaven Database. Using the
default parameters of the Loop Search feature of the Biopolymer module of the Sybyl
molecular modeling package, homology and fit of amino acids which could replace the native
loops between β-strands yet maintain overall tertiary structure were determined. Constructs
encoding the modified Env polypeptides were then designed (Example 1.B.).

Thus, the modified Env polypeptides typically have enough of the bridging sheet
removed to expose the CD4 groove, but have enough of the structure to allow correct folding
of the Env glycoprotein. Exemplary constructs are described below.

Polypeptide Production 

The polypeptides of the present invention can be produced in any number of ways
which are well known in the art.

In one embodiment, the polypeptides are generated using recombinant techniques,
well known in the art. In this regard, oligonucleotide probes can be devised based on the
known sequences of the Env (e.g., gp120) polypeptide genome and used to probe genomic or
cDNA libraries for Env genes. The gene can then be further isolated using standard
techniques and, e.g., restriction enzymes employed to truncate the gene at desired portions of
the full-length sequence. Similarly, the Env gene(s) can be isolated directly from cells and
tissues containing the same, using known techniques, such as phenol extraction and the
sequence further manipulated to produce the desired truncations. See, e.g., Sambrook et al.,
supra, for a description of techniques used to obtain and isolate DNA.

The genes encoding the modified (e.g., truncated and/or substituted) polypeptides can
be produced synthetically, based on the known sequences. The nucleotide sequence can be
designed with the appropriate codons for the particular amino acid sequence desired. The
complete coding sequence is generally assembled from overlapping oligonucleotides prepared
by standard methods. See, e.g., Edge (1981)
Nature 292:756; Nambiar et al. (1984) Science 223:1299; Jay et al. (1984) J. Biol. Chem.
259:6311; Stemmer et al. (1995) Gene 164:49-53.
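
As a simple illustration of designing a nucleotide sequence with codons for a desired amino acid sequence, the sketch below back-translates a peptide using the standard genetic code. The rule used to pick one codon per amino acid (the first codon encountered in the table) is an arbitrary placeholder adopted for this sketch; an actual design would more likely draw on a codon-usage table for the intended expression host, which is not provided here.

```python
from itertools import product

# Standard genetic code laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Placeholder preference: the first codon listed for each amino acid.
PREFERRED_CODON = {}
for codon, aa in CODON_TABLE.items():
    PREFERRED_CODON.setdefault(aa, codon)


def back_translate(protein: str) -> str:
    """Return a DNA sequence encoding the given one-letter protein sequence."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def translate(dna: str) -> str:
    """Translate DNA codon by codon; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    designed = back_translate("MKV")
    print(designed, translate(designed))
```

Translating the designed sequence back to protein, as in the usage lines, is a convenient check that the chosen codons encode the intended residues.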

Recombinant techniques are readily used to clone a gene encoding an Env
polypeptide, which can then be mutagenized in vitro by the replacement of the
appropriate base pair(s) to result in the codon for the desired amino acid. Such a change can
include as little as one base pair, effecting a change in a single amino acid, or can encompass
several base pair changes. Alternatively, the mutations can be effected using a mismatched
primer which hybridizes to the parent nucleotide sequence (generally cDNA corresponding to
the RNA sequence), at a temperature below the melting temperature of the mismatched
duplex. The primer can be made specific by keeping primer length and base composition
within relatively narrow limits and by keeping the mutant base centrally located. See, e.g.,
Innis et al., (1990) PCR Applications: Protocols for Functional Genomics; Zoller and Smith,
Methods Enzymol. (1983) 100:468. Primer extension is effected using DNA polymerase, the
product cloned and clones containing the mutated DNA, derived by segregation of the primer
extended strand, selected. Selection can be accomplished using the mutant primer as a
hybridization probe. The technique is also applicable for generating multiple point
mutations. See, e.g., Dalbadie-McFarland et al., Proc. Natl. Acad. Sci. USA (1982) 79:6409.
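
The placement of the mutant base at the center of a mismatched primer can be sketched as follows. This is an illustrative helper only. The function name and the default flank length are assumptions, the primer is written in the same sense as the supplied sequence (so it would anneal to the complementary strand), and no melting temperature or base-composition check is attempted.

```python
def design_mutagenic_primer(template: str, position: int,
                            new_base: str, flank: int = 10) -> str:
    """Build a primer carrying a single mismatch at its center.

    `position` is the 0-based index of the base to change; `flank` bases of
    unchanged template sequence are kept on each side of the mismatch.
    """
    template = template.upper()
    new_base = new_base.upper()

    if len(new_base) != 1 or new_base not in "ACGT":
        raise ValueError("new_base must be one of A, C, G or T")
    if not flank <= position < len(template) - flank:
        raise ValueError("not enough flanking sequence around the position")
    if template[position] == new_base:
        raise ValueError("new_base is identical to the template base")

    upstream = template[position - flank:position]
    downstream = template[position + 1:position + 1 + flank]
    return upstream + new_base + downstream


if __name__ == "__main__":
    parent = "AAAAACCCCCGTTTTTGGGGG"
    # Change the G at index 10 to A, keeping 5 bases on either side.
    print(design_mutagenic_primer(parent, 10, "A", flank=5))
```

Equal flanks on both sides keep the mismatch centrally located, which is the property the passage above identifies as contributing to primer specificity.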

Once coding sequences for the desired proteins have been isolated or synthesized,
they can be cloned into any suitable vector or replicon for expression. As will be apparent
from the teachings herein, a wide variety of vectors encoding modified polypeptides can be
generated by creating expression constructs which operably link, in various combinations,
polynucleotides encoding Env polypeptides having deletions or mutations therein. Thus,
polynucleotides encoding a particular deleted V1/V2 region can be operably linked with
polynucleotides encoding polypeptides having deletions or replacements in the small loop
region and the construct introduced into a host cell for polypeptide expression. Non-limiting
examples of such combinations are discussed in the Examples.

Numerous cloning vectors are known to those of skill in the art, and the selection of
an appropriate cloning vector is a matter of choice. Examples of recombinant DNA vectors
for cloning and host cells which they can transform include the bacteriophage λ (E. coli),
pBR322 (E. coli), pACYC177 (E. coli), pKT230 (gram-negative bacteria), pGV1106
(gram-negative bacteria), pLAFR1 (gram-negative bacteria), pME290 (non-E. coli
gram-negative bacteria), pHV14 (E. coli and Bacillus subtilis), pBD9 (Bacillus), pIJ61
(Streptomyces), pUC6 (Streptomyces), YIp5 (Saccharomyces), YCp19 (Saccharomyces) and
bovine papilloma virus (mammalian cells). See, generally, DNA Cloning: Vols. I & II, supra;
Sambrook et al., supra; B. Perbal, supra.

Insect cell expression systems, such as baculovirus systems, can also be used and are
known to those of skill in the art and described in, e.g., Summers and Smith, Texas
Agricultural Experiment Station Bulletin No. 1555 (1987). Materials and methods for
baculovirus/insect cell expression systems are commercially available in kit form from, inter
alia, Invitrogen, San Diego CA ("MaxBac" kit).

Plant expression systems can also be used to produce the modified Env proteins.
Generally, such systems use virus-based vectors to transfect plant cells with heterologous
genes. For a description of such systems see, e.g., Porta et al., Mol. Biotech. (1996) 5:209-221;
and Hackland et al., Arch. Virol. (1994) 139:1-22.

Viral systems, such as a vaccinia based infection/transfection system, as described in
Tomei et al., J. Virol. (1993) 67:4017-4026 and Selby et al., J. Gen. Virol. (1993)
74:1103-1113, will also find use with the present invention. In this system, cells are first
infected in vitro with a vaccinia virus recombinant that encodes the bacteriophage T7 RNA
polymerase. This polymerase displays exquisite specificity in that it only transcribes
templates bearing T7 promoters. Following infection, cells are transfected with the DNA of
interest, driven by a T7 promoter. The polymerase expressed in the cytoplasm from the
vaccinia virus recombinant transcribes the transfected DNA into RNA which is then
translated into protein by the host translational machinery. The method provides for high
level, transient, cytoplasmic production of large quantities of RNA and its translation
product(s).

The gene can be placed under the control of a promoter, ribosome binding site (for
bacterial expression) and, optionally, an operator (collectively referred to herein as "control"
elements), so that the DNA sequence encoding the desired Env polypeptide is transcribed into
RNA in the host cell transformed by a vector containing this expression construction. The
coding sequence may or may not contain a signal peptide or leader sequence. With the
present invention, either the naturally occurring signal peptides or heterologous sequences can
be used. Leader sequences can be removed by the host in post-translational processing. See,
e.g., U.S. Patent Nos. 4,431,739; 4,425,437; 4,338,397. Such sequences include, but are not
limited to, the TPA leader, as well as the honey bee melittin signal sequence.

Other regulatory sequences may also be desirable which allow for regulation of
expression of the protein sequences relative to the growth of the host cell. Such regulatory
sequences are known to those of skill in the art, and examples include those which cause the
expression of a gene to be turned on or off in response to a chemical or physical stimulus,
including the presence of a regulatory compound. Other types of regulatory elements may
also be present in the vector, for example, enhancer sequences.

The control sequences and other regulatory sequences may be ligated to the coding 
sequence prior to insertion into a vector. Alternatively, the coding sequence can be cloned 
directly into an expression vector which already contains the control sequences and an 
appropriate restriction site. 

In some cases it may be necessary to modify the coding sequence so that it may be
attached to the control sequences with the appropriate orientation; i.e., to maintain the proper
reading frame. Mutants or analogs may be prepared by the deletion of a portion of the
sequence encoding the protein, by insertion of a sequence, and/or by substitution of one or
more nucleotides within the sequence. Techniques for modifying nucleotide sequences, such
as site-directed mutagenesis, are well known to those skilled in the art. See, e.g., Sambrook
et al., supra; DNA Cloning, Vols. I and II, supra; Nucleic Acid Hybridization, supra.

The expression vector is then used to transform an appropriate host cell. A number of
mammalian cell lines are known in the art and include immortalized cell lines available from
the American Type Culture Collection (ATCC), such as, but not limited to, Chinese hamster
ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells
(COS), human hepatocellular carcinoma cells (e.g., Hep G2), Vero293 cells, as well as others.
Similarly, bacterial hosts such as E. coli, Bacillus subtilis, and Streptococcus spp., will find
use with the present expression constructs. Yeast hosts useful in the present invention
include inter alia, Saccharomyces cerevisiae, Candida albicans, Candida maltosa,
Hansenula polymorpha, Kluyveromyces fragilis, Kluyveromyces lactis, Pichia guilliermondii,
Pichia pastoris, Schizosaccharomyces pombe and Yarrowia lipolytica. Insect cells for use
with baculovirus expression vectors include, inter alia, Aedes aegypti, Autographa
californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and
Trichoplusia ni.

Depending on the expression system and host selected, the proteins of the present
invention are produced by growing host cells transformed by an expression vector described
above under conditions whereby the protein of interest is expressed. The selection of the
appropriate growth conditions is within the skill of the art.

In one embodiment, the transformed cells secrete the polypeptide product into the
surrounding media. Certain regulatory sequences can be included in the vector to enhance
secretion of the protein product, for example using a tissue plasminogen activator (TPA)
leader sequence, a γ-interferon signal sequence or other signal peptide sequences from known
secretory proteins. The secreted polypeptide product can then be isolated by various
techniques described herein, for example, using standard purification techniques such as but
not limited to, hydroxyapatite resins, column chromatography, ion-exchange
chromatography, size-exclusion chromatography, electrophoresis, HPLC, immunoadsorbent
techniques, affinity chromatography, immunoprecipitation, and the like.

Alternatively, the transformed cells are disrupted, using chemical, physical or
mechanical means, which lyse the cells yet keep the Env polypeptides substantially intact.
Intracellular proteins can also be obtained by removing components from the cell wall or
membrane, e.g., by the use of detergents or organic solvents, such that leakage of the Env
polypeptides occurs. Such methods are known to those of skill in the art and are described in,
e.g., Protein Purification Applications: A Practical Approach, (E.L.V. Harris and S. Angal,
Eds., 1990).

For example, methods of disrupting cells for use with the present invention include
but are not limited to: sonication or ultrasonication; agitation; liquid or solid extrusion; heat
treatment; freeze-thaw; desiccation; explosive decompression; osmotic shock; treatment with
lytic enzymes including proteases such as trypsin, neuraminidase and lysozyme; alkali
treatment; and the use of detergents and solvents such as bile salts, sodium dodecylsulphate,
Triton, NP40 and CHAPS. The particular technique used to disrupt the cells is largely a
matter of choice and will depend on the cell type in which the polypeptide is expressed,
culture conditions and any pre-treatment used.

Following disruption of the cells, cellular debris is removed, generally by
centrifugation, and the intracellularly produced Env polypeptides are further purified, using
standard purification techniques such as but not limited to, column chromatography,
ion-exchange chromatography, size-exclusion chromatography, electrophoresis, HPLC,
immunoadsorbent techniques, affinity chromatography, immunoprecipitation, and the like.

For example, one method for obtaining the intracellular Env polypeptides of the
present invention involves affinity purification, such as by immunoaffinity chromatography
using anti-Env specific antibodies, or by lectin affinity chromatography. Particularly
preferred lectin resins are those that recognize mannose moieties such as but not limited to
resins derived from Galanthus nivalis agglutinin (GNA), Lens culinaris agglutinin (LCA or
lentil lectin), Pisum sativum agglutinin (PSA or pea lectin), Narcissus pseudonarcissus
agglutinin (NPA) and Allium ursinum agglutinin (AUA). The choice of a suitable affinity
resin is within the skill in the art. After affinity purification, the Env polypeptides can be
further purified using conventional techniques well known in the art, such as by any of the
techniques described above.

It may be desirable to produce Env (e.g., gp120) complexes, either with itself or other
proteins. Such complexes are readily produced by, e.g., co-transfecting host cells with
constructs encoding the Env (e.g., gp120) and/or other polypeptides of the desired
complex. Co-transfection can be accomplished either in trans or cis, i.e., by using separate
vectors or by using a single vector which bears both the Env gene and the other gene. If done
using a single vector, both genes can be driven by a single set of control elements or, alternatively,
the genes can be present on the vector in individual expression cassettes, driven by individual
control elements. Following expression, the proteins will spontaneously associate.
Alternatively, the complexes can be formed by mixing the individual proteins together which
have been produced separately, either in purified or semi-purified form, or even by mixing
culture media in which host cells expressing the proteins have been cultured. See,
International Publication No. WO 96/04301, published February 15, 1996, for a description
of such complexes.

Relatively small polypeptides, i.e., up to about 50 amino acids in length, can be
conveniently synthesized chemically, for example by any of several techniques that are
known to those skilled in the peptide art. In general, these methods employ the sequential
addition of one or more amino acids to a growing peptide chain. Normally, either the amino
or carboxyl group of the first amino acid is protected by a suitable protecting group. The
protected or derivatized amino acid can then be either attached to an inert solid support or
utilized in solution by adding the next amino acid in the sequence having the complementary
(amino or carboxyl) group suitably protected, under conditions that allow for the formation of
an amide linkage. The protecting group is then removed from the newly added amino acid
residue and the next amino acid (suitably protected) is then added, and so forth. After the
desired amino acids have been linked in the proper sequence, any remaining protecting
groups (and any solid support, if solid phase synthesis techniques are used) are removed
sequentially or concurrently, to render the final polypeptide. By simple modification of this
general procedure, it is possible to add more than one amino acid at a time to a growing
chain, for example, by coupling (under conditions which do not racemize chiral centers) a
protected tripeptide with a properly protected dipeptide to form, after deprotection, a
pentapeptide. See, e.g., J. M. Stewart and J. D. Young, Solid Phase Peptide Synthesis
(Pierce Chemical Co., Rockford, IL 1984) and G. Barany and R. B. Merrifield, The Peptides:
Analysis, Synthesis, Biology, editors E. Gross and J. Meienhofer, Vol. 2, (Academic Press,
New York, 1980), pp. 3-254, for solid phase peptide synthesis techniques; and M. Bodansky,
Principles of Peptide Synthesis, (Springer-Verlag, Berlin 1984) and E. Gross and J.
Meienhofer, Eds., The Peptides: Analysis, Synthesis, Biology, Vol. 1, for classical solution
synthesis.

Typical protecting groups include t-butyloxycarbonyl (Boc); 9-fluorenylmethoxycarbonyl
(Fmoc); benzyloxycarbonyl (Cbz); p-toluenesulfonyl (Tx); 2,4-dinitrophenyl; benzyl (Bzl);
biphenylisopropyloxycarboxy-carbonyl, t-amyloxycarbonyl, isobornyloxycarbonyl,
o-bromobenzyloxycarbonyl, cyclohexyl, isopropyl, acetyl, o-nitrophenylsulfonyl and the like.

Typical solid supports are cross-linked polymeric supports. These can include
divinylbenzene cross-linked-styrene-based polymers, for example,
divinylbenzene-hydroxymethylstyrene copolymers, divinylbenzene-chloromethylstyrene
copolymers and divinylbenzene-benzhydrylaminopolystyrene copolymers.

The polypeptide analogs of the present invention can also be chemically prepared by
other methods such as by the method of simultaneous multiple peptide synthesis. See, e.g.,
Houghten, Proc. Natl. Acad. Sci. USA (1985) 82:5131-5135; U.S. Patent No. 4,631,211.



Diagnostic and Vaccine Applications

The intracellularly produced Env polypeptides of the present invention, complexes
thereof, or the polynucleotides coding therefor, can be used for a number of diagnostic and
therapeutic purposes. For example, the proteins and polynucleotides, or antibodies generated
against the same, can be used in a variety of assays to determine the presence of reactive
antibodies and/or Env proteins in a biological sample to aid in the diagnosis of HIV infection
or disease status or as a measure of response to immunization.

The presence of antibodies reactive with the Env (e.g., gp120) polypeptides and,
conversely, antigens reactive with antibodies generated thereto, can be detected using
standard electrophoretic and immunodiagnostic techniques, including immunoassays such as
competition, direct reaction, or sandwich type assays. Such assays include, but are not
limited to, western blots; agglutination tests; enzyme-labeled and mediated immunoassays,
such as ELISAs; biotin/avidin type assays; radioimmunoassays; immunoelectrophoresis;
immunoprecipitation, etc. The reactions generally include revealing labels such as
fluorescent, chemiluminescent, radioactive, or enzymatic labels or dye molecules, or other
methods for detecting the formation of a complex between the antigen and the antibody or
antibodies reacted therewith.

Solid supports can be used in the assays, such as nitrocellulose, in membrane or
microtiter well form; polyvinylchloride, in sheets or microtiter wells; polystyrene latex, in
beads or microtiter plates; polyvinylidene fluoride; diazotized paper; nylon membranes;
activated beads, and the like.

Typically, the solid support is first reacted with the biological sample (or the gp120
proteins) and washed, and then the antibodies (or a sample suspected of containing antibodies)
are applied. After washing to remove any non-bound ligand, a secondary binder moiety is added
under suitable binding conditions, such that the secondary binder is capable of associating
selectively with the bound ligand. The presence of the secondary binder can then be detected
using techniques well known in the art. Typically, the secondary binder will comprise an
antibody directed against the antibody ligands. A number of anti-human immunoglobulin
(Ig) molecules are known in the art (e.g., commercially available goat anti-human Ig or rabbit
anti-human Ig). Ig molecules for use herein will preferably be of the IgG or IgA type;
however, IgM may also be appropriate in some instances. The Ig molecules can be readily
conjugated to a detectable enzyme label, such as horseradish peroxidase, glucose oxidase,
beta-galactosidase, alkaline phosphatase and urease, among others, using methods known to
those of skill in the art. An appropriate enzyme substrate is then used to generate a detectable
signal.

Alternatively, a "two antibody sandwich" assay can be used to detect the proteins of
the present invention. In this technique, the solid support is reacted first with one or more of
the antibodies directed against Env (e.g., gp120), washed and then exposed to the test sample.
Antibodies are again added and the reaction visualized using either a direct color reaction or
using a labeled second antibody, such as an anti-immunoglobulin labeled with horseradish
peroxidase, alkaline phosphatase or urease.

Assays can also be conducted in solution, such that the viral proteins and antibodies
thereto form complexes under precipitating conditions. The precipitated complexes can then
be separated from the test sample, for example, by centrifugation. The reaction mixture can
be analyzed to determine the presence or absence of antibody-antigen complexes using any of
a number of standard methods, such as those immunodiagnostic methods described above.

The modified Env proteins, produced as described above, or antibodies to the
proteins, can be provided in kits, with suitable instructions and other necessary reagents, in
order to conduct immunoassays as described above. The kit can also contain, depending on
the particular immunoassay used, suitable labels and other packaged reagents and materials
(e.g., wash buffers and the like). Standard immunoassays, such as those described above, can
be conducted using these kits.

The Env polypeptides and polynucleotides encoding the polypeptides can also be used
in vaccine compositions, individually or in combination, in, e.g., prophylactic (i.e., to prevent
infection) or therapeutic (to treat HIV following infection) vaccines. The vaccines can
comprise mixtures of one or more of the modified Env proteins (or nucleotide sequences
encoding the proteins), such as Env (e.g., gp120) proteins derived from more than one viral
isolate. The vaccine may also be administered in conjunction with other antigens and
immunoregulatory agents, for example, immunoglobulins, cytokines, lymphokines, and
chemokines, including but not limited to IL-2, modified IL-2 (cys125→ser125), GM-CSF,
IL-12, γ-interferon, IP-10, MIP1β and RANTES. The vaccines may be administered as
polypeptides or, alternatively, as naked nucleic acid vaccines (e.g., DNA), using viral vectors
(e.g., retroviral vectors, adenoviral vectors, adeno-associated viral vectors) or non-viral
vectors (e.g., liposomes, particles coated with nucleic acid or protein). The vaccines may also
comprise a mixture of protein and nucleic acid, which in turn may be delivered using the
same or different vehicles. The vaccine may be given more than once (e.g., a "prime"
administration followed by one or more "boosts") to achieve the desired effects. The same
composition can be administered as the prime and as the one or more boosts. Alternatively,
different compositions can be used for priming and boosting.

The vaccines will generally include one or more "pharmaceutically acceptable
excipients or vehicles" such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary
substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may
be present in such vehicles.

A carrier is optionally present, which is a molecule that does not itself induce the
production of antibodies harmful to the individual receiving the composition. Suitable
carriers are typically large, slowly metabolized macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid
copolymers, lipid aggregates (such as oil droplets or liposomes), and inactive virus particles.
Such carriers are well known to those of ordinary skill in the art. Furthermore, the Env
polypeptide may be conjugated to a bacterial toxoid, such as toxoid from diphtheria, tetanus,
cholera, etc.

Adjuvants may also be used to enhance the effectiveness of the vaccines. Such
adjuvants include, but are not limited to: (1) aluminum salts (alum), such as aluminum
hydroxide, aluminum phosphate, aluminum sulfate, etc.; (2) oil-in-water emulsion
formulations (with or without other specific immunostimulating agents such as muramyl
peptides (see below) or bacterial cell wall components), such as for example (a) MF59
(International Publication No. WO 90/14837), containing 5% Squalene, 0.5% Tween 80, and
0.5% Span 85 (optionally containing various amounts of MTP-PE (see below), although not
required) formulated into submicron particles using a microfluidizer such as Model 110Y
microfluidizer (Microfluidics, Newton, MA), (b) SAF, containing 10% Squalane, 0.4%
Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP (see below) either
microfluidized into a submicron emulsion or vortexed to generate a larger particle size




WO 00/39303 PCT/US99/31272 

emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, MT)
containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components
from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM),
and cell wall skeleton (CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants,
such as Stimulon™ (Cambridge Bioscience, Worcester, MA), or particles
generated therefrom such as ISCOMs (immunostimulating complexes); (4) Complete
Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as
interleukins (IL-1, IL-2, etc.), macrophage colony stimulating factor (M-CSF), tumor necrosis
factor (TNF), etc.; (6) detoxified mutants of a bacterial ADP-ribosylating toxin such as a
cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT), particularly
LT-K63 (where lysine is substituted for the wild-type amino acid at position 63), LT-R72
(where arginine is substituted for the wild-type amino acid at position 72), CT-S109 (where
serine is substituted for the wild-type amino acid at position 109), and PT-K9/G129 (where
lysine is substituted for the wild-type amino acid at position 9 and glycine substituted at
position 129) (see, e.g., International Publication Nos. WO 93/13202 and WO 92/19265); and
(7) other substances that act as immunostimulating agents to enhance the effectiveness of the
composition.

Muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

Typically, the vaccine compositions are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles
prior to injection may also be prepared. The preparation also may be emulsified or
encapsulated in liposomes for enhanced adjuvant effect, as discussed above.

The vaccines will comprise a therapeutically effective amount of the modified Env
proteins, or complexes of the proteins, or nucleotide sequences encoding the same, and any
other of the above-mentioned components, as needed. By "therapeutically effective amount"
is meant an amount of a modified Env (e.g., gp120) protein which will induce a protective
immunological response in the uninfected, infected or unexposed individual to which it is
administered. Such a response will generally result in the development in the subject of a
secretory, cellular and/or antibody-mediated immune response to the vaccine. Usually, such
a response includes but is not limited to one or more of the following effects: the production
of antibodies from any of the immunological classes, such as immunoglobulins A, D, E, G or
M; the proliferation of B and T lymphocytes; the provision of activation, growth and
differentiation signals to immunological cells; and the expansion of helper T cells, suppressor
T cells, and/or cytotoxic T cells.

Preferably, the effective amount is sufficient to bring about treatment or prevention of
disease symptoms. The exact amount necessary will vary depending on the subject being
treated; the age and general condition of the individual to be treated; the capacity of the
individual's immune system to synthesize antibodies; the degree of protection desired; the
severity of the condition being treated; the particular Env polypeptide selected and its mode
of administration, among other factors. An appropriate effective amount can be readily
determined by one of skill in the art. A "therapeutically effective amount" will fall in a
relatively broad range that can be determined through routine trials.

Once formulated, the nucleic acid vaccines may be delivered with or without viral
vectors, as described above, by injection using either a conventional syringe or a gene gun,
such as the Accell® gene delivery system (PowderJect Technologies, Inc., Oxford, England).
Delivery of DNA into cells of the epidermis is particularly preferred as this mode of
administration provides access to skin-associated lymphoid cells and provides for a transient
presence of DNA in the recipient. Nucleic acids and/or peptides can be injected
subcutaneously, epidermally, intradermally, intramucosally (such as nasally, rectally and
vaginally), intraperitoneally, intravenously, orally or intramuscularly. Other modes of
administration include oral and pulmonary administration, suppositories, needle-less
injection, transcutaneous and transdermal applications. Dosage treatment may be a single
dose schedule or a multiple dose schedule. Administration of nucleic acids may also be
combined with administration of peptides or other substances.

While the invention has been described in conjunction with the preferred specific
embodiments thereof, it is to be understood that the foregoing description as well as the
examples which follow are intended to illustrate and not limit the scope of the invention.
Other aspects, advantages and modifications within the scope of the invention will be
apparent to those skilled in the art to which the invention pertains.

Experimental

Below are examples of specific embodiments for carrying out the present invention.
The examples are offered for illustrative purposes only, and are not intended to limit the
scope of the present invention in any way.

Efforts have been made to ensure accuracy with respect to numbers used (e.g.,
amounts, temperatures, etc.), but some experimental error and deviation should, of course, be
allowed for.

Example 1 

A.1. Best-Fit and Homology Searches

The crystal structure of HXB-2 gp120 was downloaded from the Brookhaven
database (COMPLEX (HIV ENVELOPE PROTEIN/CD4/FAB) 15-JUN-98 1GC1
TITLE: HIV-1 GP120 CORE COMPLEXED WITH CD4 AND A NEUTRALIZING
HUMAN ANTIBODY). Beta strands 3, 2, 21, and 20 of gp120 form a sheet near the CD4
binding site. Strands β-3 and β-2 are connected by the V1/V2 loop. Strands β-21 and β-20
are connected by another small loop. The H-bonds at the interface between strands β-2 and
β-21 are the only connection between domains of the "lower" half of the protein (joining
helix alpha 1 to the CD4 binding site). This beta sheet and these loops mask some antigens
(e.g., antigens which may generate neutralizing antibodies) that are only exposed during
CD4 binding.

Constructs that remove enough of the beta sheet to expose the antigens in the CD4
binding site, but leave enough of the protein to allow correct folding, were designed.
Specifically targeted were modifications to the small loop and, optionally, deletion of the
V1/V2 loops. Three different types of constructs were designed: (1) constructs encoding
polypeptides that leave the number of residues making up the entire 4-strand beta sheet intact,
but replace one or more residues; (2) constructs that encode polypeptides having at least one
residue of at least one beta strand excised; or (3) constructs encoding polypeptides having at
least two residues of at least one beta strand excised. Thus, a total of 6 different turns were
needed to rejoin the ends of the strands.

Initially, residues in the small loop (residues 427-430, relative to HXB-2) and
connected beta strands (β-20 and β-21) were modified to contain Gly and Pro (common in
beta turns). These sequences were then used as the target to match in each search. The
geometry of the target was matched to known proteins in the Brookhaven Protein Data Bank.
In particular, 5-residue turns (including an overlapping single residue at the N-terminal, the 2
residue target turn and 2 overlapping residues at the C-terminal) were searched in the
databases. In other words, these modified loops add a 2 residue turn that should be able to
support a geometry that will maintain the beta-sheet structure of the wild type protein. The
calculations were performed using the default parameters in the Loop Search feature of the
Biopolymer module of the Sybyl molecular modeling package. In each case, the 25 best fits
based on geometry alone were reviewed and, of those, several were selected for homology and fit.
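
By way of illustration only, the geometric comparison that underlies a search of this kind can be sketched as a root-mean-square deviation (RMSD) between the coordinates of a target turn and those of each candidate turn. The sketch below is not the Sybyl Loop Search procedure, whose internals are not described here. The function names `rmsd` and `rank_by_rmsd` are hypothetical, and the coordinates are assumed to have been superimposed beforehand.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: both coordinate sets are already superimposed; no
    fitting or rotation is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def rank_by_rmsd(target, candidates, keep=25):
    """Return up to `keep` (name, rmsd) pairs, best geometric fit first.

    `candidates` maps a hypothetical candidate name to its coordinates.
    """
    scored = sorted((rmsd(target, coords), name) for name, coords in candidates.items())
    return [(name, score) for score, name in scored[:keep]]
```

The default of 25 retained fits mirrors the number of best fits reviewed in each search above; homology would then be assessed separately on the retained candidates.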

In addition, it was also determined what modifications could be made to remove most
of the V1/V2 loop (residues 124-198, relative to HXB-2) yet leave the geometry of the
protein intact. As with the small loop, constructs were also designed which excised one or
more residues from the β-2 strand (residues 119-123 of HXB-2), the β-3 strand (residues
199-201 of HXB-2) or both β-2 and β-3. For these constructs, known loops were searched to
match the geometry of a pentamer (including two remaining residues from the N-terminal
side, a 2 residue turn and 1 C-terminal residue). For these searches, Gly-Gly was preferred as
the insert along with at least one C-terminal substitution.

A.2. Small Loop Replacements

In one aspect, the native sequence was replaced with residues that expose the CD4
binding site, but leave the overall geometry of the protein relatively unchanged. For the
small loop replacements, the target to match was:
ASN425-MET426-GLY427-GLY428-GLY431. Results of the search are summarized in Table 1.

Table 1: Search of Small Loop (Asn425 through Gly431)

Rank      Sequence             RMSD      % Homology  Seq Id No.
Best fit  LYS-ASP-SER-ASN-ASN  0.16689   62.5        27
3         TYR-GLY-LEU-GLY-LEU  0.220308  62.5        28
4         GLU-ARG-GLU-ASP-GLY  0.241754  62.5        29
7         ARG-LYS-GLY-GLY-ASN  0.24881   100         30
12        TRP-THR-GLY-SER-TYR  0.26417   83.33       31

Based on these results, constructs encoding Gly-Gly (#7), Gly-Ser (#12) or Gly-Gly-Asn
(#7) were recommended.
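
The selection criterion applied to the search results is not spelled out above. As a hedged illustration, the sketch below transcribes the rows of Table 1 and shows that filtering on a homology threshold (an assumed value of 80%, chosen only for this example) yields the same two ranks, #7 and #12, that were recommended. The names `TABLE_1` and `shortlist` are hypothetical.

```python
# Rows transcribed from Table 1: (rank label, sequence, RMSD, % homology).
TABLE_1 = [
    ("Best fit", "LYS-ASP-SER-ASN-ASN", 0.16689, 62.5),
    ("3", "TYR-GLY-LEU-GLY-LEU", 0.220308, 62.5),
    ("4", "GLU-ARG-GLU-ASP-GLY", 0.241754, 62.5),
    ("7", "ARG-LYS-GLY-GLY-ASN", 0.24881, 100.0),
    ("12", "TRP-THR-GLY-SER-TYR", 0.26417, 83.33),
]


def shortlist(rows, min_homology):
    """Keep rows at or above the homology threshold.

    Results are ordered by descending homology, then by ascending RMSD.
    The threshold is an assumption for illustration, not a stated criterion.
    """
    kept = [row for row in rows if row[3] >= min_homology]
    return sorted(kept, key=lambda row: (-row[3], row[2]))


if __name__ == "__main__":
    for rank, sequence, fit, homology in shortlist(TABLE_1, 80.0):
        print(rank, sequence, fit, homology)
```

Other thresholds, or a combined score weighting fit against homology, would be equally reasonable ways to express the same trade-off.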

As V1/V2 and one or more residues of β-2 and β-3 are also optionally deleted in the
modified polypeptides of the invention, known loops to match the geometry of the V1/V2
loop were also searched. For the V1/V2 loop, the target to match was:
Lys121-Leu122-Gly123-Gly124-Ser199. Some notable matches are shown in Table 2:



Table 2: Search of V1/V2 loop (Lys121 through Ser199)

Rank      Sequence             RMSD      % Homology  Seq Id No.
Best fit  GLN-VAL-HIS-ASP-GLU  0.154764  68.75       32
2         LYS-GLU-GLY-ASP-LYS  0.15718   81.25       33
9         ARG-SER-GLY-ARG-SER  0.173731  68.75       34
11        THR-LEU-GLY-ASN-SER  0.175554  81.25       35
16        HIS-PHE-GLY-ALA-GLY  0.178772  93.75       36

Based on these searches, constructs encoding Gly-Asn in place of V1/V2 were
recommended.

A.3. One Additional Residue Excisions

For a slightly truncated small loop, one more residue was trimmed from each beta
strand to slightly shorten the beta sheet. The target to match was:
ILE424-ASN425-GLY426-GLY427-LYS432. Results are shown in Table 3:



Table 3: Search of Beta sheet shortened by One residue (Ile424 through Lys432)

Rank       Sequence             RMSD      % Homology  Seq Id No.
Best fit:  ARG-MET-ALA-PRO-VAL  0.316805  58.33       37
Best hom:  ASP-SER-ASP-GLY-PRO  0.440896  83.33       38

Although these searches showed more variation and worse fits than the previous
truncation, the Pro-Val or Pro-Leu encoding constructs were very similar. Accordingly,
Ala-Pro encoding constructs were recommended.

Sequences encoding gp120 polypeptides having V1/V2 deleted and an additional
residue from β-2 or β-3 excised were also searched. For the V1/V2 loop, the target to match
was: VAL120-LYS121-GLY122-GLY123-VAL200. Some notable matches are shown in Table 4.



Table 4: Search of V1/V2 loop (Val120 through Val200)

Rank       Sequence             RMSD      % Homology  Seq Id No.
Best fit:  THR-VAL-ASP-PRO-TYR  0.400892  58.33333    39
2          SER-THR-ASN-PRO-LEU  0.402575  54.16667    40
3          THR-ARG-SER-PRO-LEU  0.403965  58.33333    41
7          ARG-MET-ALA-PRO-VAL  0.440118  58.33333    42

The construct encoding Ala-Pro (e.g., #7) was recommended.

A.4. Further Excisions

In yet another truncation, an additional residue was trimmed from the β-20 and β-21
strands to further shorten the beta sheet. The target to match was:
ILE423-ILE424-GLY425-GLY426-ALA433. Notable matches are shown in Table 5.



Table 5: Search of Beta sheet shortened by Two Residues (Ile423 through Ala433)

Rank       Sequence             RMSD      % Homology  Seq Id No.
Best fit:  THR-TYR-GLU-GLY-VAL  0.130107  79.16666    43
2          GLN-VAL-GLY-ASN-THR  0.138245  79.16666    44
3          THR-VAL-GLY-GLY-ILE  0.153362  100         45



A construct encoding Gly-Gly (e.g., #3), which has 100% homology, was
recommended.

Also searched were sequences encoding a deleted V1/V2 region and at least two
residues excised from β-2, β-3 or at least one residue excised from β-2 and β-3. The target to
match was: CYS119-VAL120-GLY121-GLY122-ILE201. Notable matches are shown in
Table 6.

Table 6: Search of V1/V2 loop (Cys119 through Ile201)

Rank       Sequence             RMSD      % Homology  Seq Id No.
Best fit:  ASP-LEU-PRO-GLY-CYS  0.250501  75          46
4          ASP-VAL-GLY-GLY-LEU  0.290383  100         47


It was determined that both constructs would be used.

B.1. Constructs encoding modified Env polypeptides

As described above, the native loops extruding from the 4-β antiparallel strands were
excised and replaced with 1 to 3 residue turns. The loops were replaced so as to leave the
entire β-strands intact, or were excised by trimming one or more amino acids from each side
of the connected strands. The ends of the strands were rejoined with turns that preserve the
same backbone geometry (e.g., tertiary structure of β-20 and β-21), as
determined by searching the Brookhaven Protein Data Bank.
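
The excise-and-rejoin operation described above can be sketched at the sequence level as follows. This is a minimal illustration, assuming a sequence stored as (HXB-2-relative position, one-letter residue) pairs; the function name `replace_loop` and the toy ten-residue sequence are hypothetical and do not correspond to any actual Env sequence.

```python
def replace_loop(numbered_residues, left_anchor, right_anchor, turn):
    """Excise the residues strictly between two anchors and insert a turn.

    numbered_residues: list of (position, one_letter_residue) pairs.
    left_anchor, right_anchor: positions that are kept on either side.
    turn: residues (one-letter codes) that rejoin the two strand ends.
    """
    if left_anchor >= right_anchor:
        raise ValueError("left anchor must precede right anchor")
    positions = {position for position, _ in numbered_residues}
    if left_anchor not in positions or right_anchor not in positions:
        raise ValueError("both anchors must be present in the sequence")
    head = "".join(aa for position, aa in numbered_residues if position <= left_anchor)
    tail = "".join(aa for position, aa in numbered_residues if position >= right_anchor)
    return head + turn + tail


if __name__ == "__main__":
    # Toy sequence numbered 1..10; residues 4-7 play the role of the loop.
    toy = list(enumerate("ACDEFGHIKL", start=1))
    print(replace_loop(toy, 3, 8, "GG"))
```

In this notation a label of the form Val120-Gly-Gly-Ile201 would correspond to a left anchor of 120, a right anchor of 201 and the turn "GG".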

Table 7A is a summary of the truncations of the variable regions 1 and 2
recommended for this study, as determined in Example 1.A. above.

Table 7A

V1/V2 Modifications           SEQ ID NO  Figure
-LEU122-GLY-ASN-SER199        7          10
-LYS121-ALA-PRO-VAL200-       6          9
-VAL120-GLY-GLY-ILE201-       4          7
-VAL120-PRO-GLY-ILE201B-      5          8
-VAL120-GLY-ALA-GLY-ALA204-   3          6
-VAL120-GLY-GLY-ALA-THR202-   8          11
-VAL127-GLY-ALA-GLY-ASN195-   25         28


As previously noted, the polypeptides encoded by the constructs of the present
invention are numbered relative to HXB-2, but the particular amino acid residue of the
polypeptides encoded by these exemplary constructs is based on SF162. Thus, for example,
although amino acid residue 195 in HXB-2 is a serine (S), constructs encoding polypeptides
having the wild type SF162 sequence will have an asparagine (N) at this position. Table 7B
shows just three of the variations in amino acid sequence between strains HXB-2 and SF162.
The entire sequences, including differences in residue and amino acid number, of HXB-2 and
SF162 are shown in the alignment of Figure 2 (SEQ ID NOs:1 and 2).

Table 7B

HXB-2 amino acid number  HXB-2 Residue  SF162 Residue/amino acid number
128                      Serine (S)     Thr (T)/114
195                      Serine (S)     Asn (N)/188
426                      Met (M)        Arg (R)/411
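
The dual numbering convention (positions given relative to HXB-2, residues taken from SF162) can be illustrated with a small lookup built only from the three rows of Table 7B. The names `TABLE_7B` and `sf162_equivalent` are hypothetical; a complete mapping would have to be derived from the full alignment of Figure 2, which is not reproduced here.

```python
# HXB-2 position -> (HXB-2 residue, SF162 residue, SF162 position), from Table 7B.
TABLE_7B = {
    128: ("S", "T", 114),
    195: ("S", "N", 188),
    426: ("M", "R", 411),
}


def sf162_equivalent(hxb2_position):
    """Return (SF162 residue, SF162 position) for an HXB-2 position in Table 7B."""
    if hxb2_position not in TABLE_7B:
        raise KeyError(f"HXB-2 position {hxb2_position} is not listed in Table 7B")
    _, sf162_residue, sf162_position = TABLE_7B[hxb2_position]
    return sf162_residue, sf162_position


if __name__ == "__main__":
    # Position 426 relative to HXB-2 is Met in HXB-2 but Arg (residue 411) in SF162.
    print(sf162_equivalent(426))
```

Note that the offset between the two numbering schemes is not constant across the three rows, which is why a per-position lookup is used rather than a fixed shift.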



Constructs containing deletions in the β-20 strand, β-21 strand and small loop were
also constructed. Shown in Table 8 are constructs encoding truncations in these regions. The
constructs in Table 8 are numbered relative to HXB-2 but the unmodified amino acid
sequence is based on SF162. Thus, the construct encodes an arginine (Arg) as is found in
SF162 in the amino acid position numbered 426 relative to HXB-2 (See, also, Table 7B).
Changes from wildtype (SF162) are shown in bold in Table 8.



Table 8

Small Loop/β-20 and β-21 (Modified)  SEQ ID NO  Figure
-TRP427-GLY-GLY431-                  9          12
-ARG426-GLY-GLY-GLY431-              10         13
-ARG426-GLY-SER-GLY431B-             11         14
-ARG426-GLY-GLY-ASN-LYS432-          12         15
-ASN425-ALA-PRO-LYS432-              13         16
-ILE424-GLY-GLY-ALA433-              14         17
-ILE423-GLY-GLY-MET434-              15         18
-GLN422-GLY-GLY-TYR435-              16         19
-GLN422-ALA-PRO-TYR435B-             17         20

The deletion constructs shown in Tables 7 and 8 for each one of the β-strands and
combinations of them are constructed. These deletions will be tested in the Env forms gp120,
gp140 and gp160 from different HIV strains like subtype B strains (e.g., SF162, US4, SF2),
subtype E strains (e.g., CM235) and subtype C strains (e.g., AF110968 or AF110975).
Exemplary constructs for SF162 are shown in the
Figures and are summarized in Table 9. As noted above in Figure 2 and Table 7B, in the
bridging sheet region, the amino acid sequence of SF162 differs from HXB-2 in that the
Met426 of HXB-2 is an Arg in SF162. In Table 9, V1/V2 refers to deletions in the V1/V2
region; bsm refers to a modification in the bridging sheet small loop.

Table 9

Construct                      Seq. Id.  Fig.  Modification/Amino acid sequence
Val120-Ala204                  3         6     V1/V2: Val120-Gly-Ala-Gly-Ala204
Val120-Ile201                  4         7     V1/V2: Val120-Gly-Gly-Ile201
Val120-Ile201B                 5         8     V1/V2: Val120-Pro-Gly-Ile201
Lys121-Val200                  6         9     V1/V2: Lys121-Ala-Pro-Val200
Leu122-Ser199                  7         10    V1/V2: Leu122-Gly-Asn-Ser199
Val120-Thr202                  8         11    V1/V2: Val120-Gly-Gly-Ala-Thr202
Trp427-Gly431                  9         12    bsm: Trp427-Gly-Gly431
Arg426-Gly431                  10        13    bsm: Arg426-Gly-Gly-Gly431
Arg426-Gly431B                 11        14    bsm: Arg426-Gly-Ser-Gly431
Arg426-Lys432                  12        15    bsm: Arg426-Gly-Gly-Asn-Lys432
Asn425-Lys432                  13        16    bsm: Asn425-Ala-Pro-Lys432
Ile424-Ala433                  14        17    bsm: Ile424-Gly-Gly-Ala433
Ile423-Met434                  15        18    bsm: Ile423-Gly-Gly-Met434
Gln422-Tyr435                  16        19    bsm: Gln422-Gly-Gly-Tyr435
Val127-Asn195                  25        28    bsm: Val127-Gly-Ala-Gly-Asn195
Gln422-Tyr435B                 17        20    bsm: Gln422-Ala-Pro-Tyr435
Leu122-Ser199; Arg426-Gly431   18        21    V1/V2/bsm: Leu122-Gly-Asn-Ser199 -- Arg426-Gly-Gly-Gly431
Leu122-Ser199; Arg426-Lys432   19        22    V1/V2/bsm: Leu122-Gly-Asn-Ser199 -- Arg426-Gly-Gly-Asn-Lys432
Leu122-Ser199-Trp427-Gly431    20        23    V1/V2/bsm: Leu122-Gly-Asn-Ser199 -- Trp427-Gly-Gly431
Lys121-Val200-Asn425-Lys432    21        24    V1/V2/bsm: Lys121-Ala-Pro-Val200 -- Asn425-Ala-Pro-Lys432
Val120-Ile201-Ile424-Ala433    22        25    V1/V2/bsm: Val120-Gly-Gly-Ile201 -- Ile424-Gly-Gly-Ala433
Val120-Ile201B-Ile424-Ala433   23        26    V1/V2/bsm: Val120-Pro-Gly-Ile201 -- Ile424-Gly-Gly-Ala433
Val120-Thr202;Ile424-Ala433    24        27    V1/V2/bsm: Val120-Gly-Gly-Ala-Thr202 -- Ile424-Gly-Gly-Ala433
Val127-Asn195; Arg426-Gly431   25        29    V1/V2/bsm: Val127-Gly-Ala-Gly-Asn195 -- Arg426-Gly-Gly-Gly431

Combinations of V1/V2 deletions and bridging sheet small loop modifications in
addition to those specifically shown in Table 9 are also within the scope of the present
invention. Various forms of the different embodiments of the invention, described herein,
may be combined.
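
To give a sense of the size of this combinatorial space, the sketch below pairs each V1/V2 modification with each bridging sheet small loop modification. The labels are transcribed from the V1/V2 and small loop tables above; the list names and the function `all_combinations` are hypothetical, and nothing here implies that every pairing was or would be constructed.

```python
from itertools import product

# V1/V2 modifications (labels as listed in the V1/V2 table above).
V1V2_MODS = [
    "Leu122-Gly-Asn-Ser199",
    "Lys121-Ala-Pro-Val200",
    "Val120-Gly-Gly-Ile201",
    "Val120-Pro-Gly-Ile201",
    "Val120-Gly-Ala-Gly-Ala204",
    "Val120-Gly-Gly-Ala-Thr202",
    "Val127-Gly-Ala-Gly-Asn195",
]

# Bridging sheet small loop (bsm) modifications (labels as listed in Table 8).
BSM_MODS = [
    "Trp427-Gly-Gly431",
    "Arg426-Gly-Gly-Gly431",
    "Arg426-Gly-Ser-Gly431",
    "Arg426-Gly-Gly-Asn-Lys432",
    "Asn425-Ala-Pro-Lys432",
    "Ile424-Gly-Gly-Ala433",
    "Ile423-Gly-Gly-Met434",
    "Gln422-Gly-Gly-Tyr435",
    "Gln422-Ala-Pro-Tyr435",
]


def all_combinations():
    """Return every V1/V2 x bsm pairing as a 'V1/V2 -- bsm' label."""
    return [f"{v1v2} -- {bsm}" for v1v2, bsm in product(V1V2_MODS, BSM_MODS)]


if __name__ == "__main__":
    combos = all_combinations()
    print(len(combos), "possible pairings")
```

Of these pairings, only the eight listed at the end of Table 9 are shown as exemplary combined constructs.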

The first screening will be done after transient expression in COS-7, RD and/or 293
cells. The proteins that are expressed will be analyzed by immunoblot, ELISA, and for
binding to mAbs directed to the CD4 binding site and other important epitopes on gp120 to
determine integrity of structure. They will also be tested in a CD4 binding assay and, in
addition, for the binding of neutralizing antibodies, for example using patient sera or mAb
448D (directed to Glu370 and Tyr384, a region of the CD4 binding groove that is not altered
by the deletions).

The immunogenicity of these novel Env glycoproteins will be tested in rodents and
primates. The structures will be administered as DNA vaccines or adjuvanted protein
vaccines or in combined modalities. The goal of these vaccinations will be to achieve broadly
reactive neutralizing antibody responses.

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What is claimed is: 

5 1 . A polynucleotide encoding a modified HIV Env polypeptide wherein the 

polypeptide has at least one amino acid deleted or replaced in the region corresponding to 
residues 420 to 436 relative to HXB-2 (SEQ ID NO:l). 

2. The polynucleotide of claim 1, wherein the region corresponding to residues 124- 
10 198 relative to HXB-2 is deleted and at least one amino acid is deleted or replaced in the 

regions corresponding to the residues 1 19 to 123 and 199 to 210 relative to HXB-2 (SEQ ID 
NO:l). 

3. The polynucleotide of claim 1, wherein at least one amino acid in the region 

1 5 corresponding to residues 427 through 429 relative to HXB-2 (SEQ ID NO: 1) is deleted or 
replaced. 

4. The polynucleotide of claim 2, wherein at least one amino acid in the region 
corresponding to residues 427 through 429 relative to HXB-2 (SEQ ID NO:1) is deleted or 
replaced. 

5. The polynucleotide of claim 1, wherein the amino acid sequence of the modified 
HIV Env polypeptide is based on strain SF162. 

6. An immunogenic modified HIV Env polypeptide having at least one amino acid 
deleted or replaced in the region corresponding to residues 420 through 436, relative to 
HXB-2 (SEQ ID NO:1). 

7. The polypeptide of claim 6, wherein one amino acid is deleted in the region 
corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1). 

8. The polypeptide of claim 6, wherein more than one amino acid is deleted in the 
region corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1). 



9. The polypeptide of claim 6, wherein at least one amino acid is replaced in the 
region corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1). 

10. The polypeptide of claim 6, wherein at least one amino acid residue between 
about amino acid residue 427 and amino acid residue 429 relative to HXB-2 (SEQ ID NO:1) 
is deleted or replaced. 

11. The polypeptide of claim 6, wherein the V1 and V2 regions of the polypeptide are 
truncated. 

12. The polypeptide of claim 10, wherein the V1 and V2 regions of the polypeptide 
are truncated. 

13. The polypeptide of claim 6, wherein the amino acid sequence of the modified 
HIV Env polypeptide is based on strain SF162. 

14. A construct comprising the nucleotide sequence depicted in Figure 6 (SEQ ID NO:3). 

15. A construct comprising the nucleotide sequence depicted in Figure 7 (SEQ ID NO:4). 

16. A construct comprising the nucleotide sequence depicted in Figure 8 (SEQ ID NO:5). 

17. A construct comprising the nucleotide sequence depicted in Figure 9 (SEQ ID NO:6). 

18. A construct comprising the nucleotide sequence depicted in Figure 10 (SEQ ID NO:7). 

19. A construct comprising the nucleotide sequence depicted in Figure 11 (SEQ ID NO:8). 

20. A construct comprising the nucleotide sequence depicted in Figure 12 (SEQ ID NO:9). 

21. A construct comprising the nucleotide sequence depicted in Figure 13 (SEQ ID NO:10). 

22. A construct comprising the nucleotide sequence depicted in Figure 14 (SEQ ID NO:11). 

23. A construct comprising the nucleotide sequence depicted in Figure 15 (SEQ ID NO:12). 

24. A construct comprising the nucleotide sequence depicted in Figure 16 (SEQ ID NO:13). 

25. A construct comprising the nucleotide sequence depicted in Figure 17 (SEQ ID NO:14). 

26. A construct comprising the nucleotide sequence depicted in Figure 18 (SEQ ID NO:15). 

27. A construct comprising the nucleotide sequence depicted in Figure 19 (SEQ ID NO:16). 

28. A construct comprising the nucleotide sequence depicted in Figure 20 (SEQ ID NO:17). 

29. A construct comprising the nucleotide sequence depicted in Figure 21 (SEQ ID NO:18). 

30. A construct comprising the nucleotide sequence depicted in Figure 22 (SEQ ID NO:19). 

31. A construct comprising the nucleotide sequence depicted in Figure 23 (SEQ ID NO:20). 

32. A construct comprising the nucleotide sequence depicted in Figure 24 (SEQ ID NO:21). 

33. A construct comprising the nucleotide sequence depicted in Figure 25 (SEQ ID NO:22). 

34. A construct comprising the nucleotide sequence depicted in Figure 26 (SEQ ID NO:23). 

35. A construct comprising the nucleotide sequence depicted in Figure 27 (SEQ ID NO:24). 

36. A construct comprising the nucleotide sequence depicted in Figure 28 (SEQ ID NO:25). 

37. A construct comprising the nucleotide sequence depicted in Figure 29 (SEQ ID NO:26). 

38. A vaccine composition comprising a polynucleotide encoding a modified Env 
polypeptide according to any one of claims 1-5. 

39. A vaccine composition comprising a polynucleotide construct encoding a 
modified Env polypeptide according to any of claims 14-37. 

40. A vaccine composition comprising a modified Env polypeptide according to any 
of claims 6-13. 

41. The vaccine composition of any of claims 38-40, further comprising an adjuvant. 

42. A method of inducing an immune response in a subject comprising administering a 
polynucleotide according to any one of claims 1-5 in an amount sufficient to induce an 
immune response in the subject. 

43. A method of inducing an immune response in a subject comprising administering a 
polynucleotide construct according to any one of claims 14-37 in an amount sufficient to 
induce an immune response in the subject. 

44. A method of inducing an immune response in a subject comprising administering 
a composition comprising a modified Env polypeptide according to any one of claims 6-13, 
wherein the composition is administered in an amount sufficient to induce an immune 
response in the subject. 

45. The method of any of claims 42-44 further comprising administering an adjuvant 
to the subject. 

46. A method of inducing an immune response in a subject comprising: 

(a) administering a first composition comprising a polynucleotide according to any of 
claims 1-5 in a priming step and 
(b) administering a second composition comprising a modified Env polypeptide 
according to any of claims 6-13, as a booster, in an amount sufficient to induce an immune 
response in the subject. 

47. The method of claim 46 wherein the first composition or second composition 
further comprises an adjuvant. 

48. The method of claim 46 wherein the first and second compositions further 
comprise an adjuvant. 

[Unlabeled alignment fragment, positions 841-880: 
CGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC] 

FIG. 5M (WO 00/39303, PCT/US99/31272, sheet 41/65) 

[Nucleotide sequence alignment of the combined V1/V2-deletion and small-loop constructs 
against the consensus. Only the row labels and the consensus sequence are legible on this sheet.] 

Rows: Leu122-Ser199-Trp427-Gly431; Val127-Asn195-Arg426-Gly431; 
Val120-Thr202-Ile424-Ala433; Leu122-Ser199-Arg426-Lys432; 
Leu122-Ser199-Arg426-Gly431; Lys121-Val200-Asn425-Lys432; 
Val120-Ile201-Ile424-Ala433; Val120-Ile201B-Ile424-Ala433; Consensus 

Consensus, positions 2071-2250: 
2071-2100 ATCTGGGACGACCTGCGCAGCCTGTGCCTG 
2101-2130 TTCAGCTACCACCGCCTGCGCGACCTGATC 
2131-2160 CTGATCGCCGCCCGCATCGTGGAGCTGCTG 
2161-2190 GGCCGCCGCGGCTGGGAGGCCCTGAAGTAC 
2191-2220 TGGGGCAACCTGCTGCAGTACTGGATCCAG 
2221-2250 GAGCTGAAGAACAGCGCCGTGAGCCTGTTC 

SEQ ID NO:3 VAL120-ALA204 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgggcgccggcgcctgccccaa 

ggtgagcttcgagcccatccccatccactactck:gcccccgccggcttcgccatcxt 

caacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcaccc 

acggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggc 

gtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatggtgcagctgaagga 

gagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcc 

ccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaaca 

tcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 

ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacag 

cttcaactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaa 

caacaccatcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcaga 

tcatcaaccgctggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatc 

cgctgcagcagcaacatcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaa 

caccaccgagatcttccgccccggcggcggcgacatgcgcgacaactggcgcagcgagctgt 

acaagtacaaggtggtgaagatcgagcccctgggcgtggcccccaccaaggccaagcgccgc 

gtggtgcagcgcgagaagcgcgccgtgaccctgggcgccatgttcctgggcttcctgggcgcc 

gccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgccagctgctgag 

cggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagcagcacctgctgc 

agctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctg 

aaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgcaccaccgccgt 

gccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatgacctgga 

tggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggagagc 

cagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgt 

ggaactggttcgacatcagcaagtggctgtggtacatcaagatcttcatcatgatcgtgggcg 

gcctggtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaaccgcgtgcgccagggct 

acagccccctgagcttccagacccgcttccccgccccccgcggccccgaccgccccgagggca 

tcgaggaggagggcggcgagcgcgaccgcgaccgcagcagccccctggtgcacggcctgctg 

gccctgatctgggacgacctgcgcagcctgtgcctgttcagctaccaccgcctgcgcgacctg 

atcctgatcgccgcccgcatcgtggagctgctgggccgccgcggctgggaggccctgaagtac 

tggggcaacctgctgcagtactggatccaggagctgaagaacagcgccgtgagcctgttcga 

cgccatcgccatcgccgtggccgagggcaccgaccgcatcatcgaggtggcccagcgcatcg 

gccgcgccttcctgcacatcccccgccgcatccgccagggcttcgagcgcgccctgctgtaac 

TCGAG 
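
Because the sequence listings here were recovered by character recognition, some lines contain symbols that cannot be nucleotides. The following is a minimal sketch, not part of the original disclosure, of how such positions could be flagged before a listing is used; the function names `find_invalid_bases` and `count_valid_bases` are hypothetical, and the sample string is a short illustrative excerpt rather than a complete sequence.

```python
# Characters accepted as nucleotides; the listings mix upper and lower case.
VALID_BASES = set("ACGTacgt")


def find_invalid_bases(listing: str):
    """Return (line_number, column, character) for every character that is
    neither whitespace nor one of A, C, G, T (case-insensitive)."""
    problems = []
    for line_no, line in enumerate(listing.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch.isspace() or ch in VALID_BASES:
                continue
            problems.append((line_no, col, ch))
    return problems


def count_valid_bases(listing: str) -> int:
    """Count only the characters that are valid nucleotides."""
    return sum(1 for ch in listing if ch in VALID_BASES)


# Illustrative excerpt: the second line carries recognition damage ("k:").
sample = "gaattcgccaccatgg\nggcgck:ggcgac\nTCGAG"
for line_no, col, ch in find_invalid_bases(sample):
    print(f"line {line_no}, column {col}: unexpected character {ch!r}")
print("valid bases:", count_valid_bases(sample))
```

A check of this kind only locates damaged positions; the correct bases still have to be taken from the original figures.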
SEQ ID NO:4 VAL120-ILE201 

GAATTCGCCACCATGGATGCAATGAAGAGAGG 

GTCTTCGTTTCGGCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCATCACCCAGGCCTG 

ccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcct 

gaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagt 

gcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 

gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagct 

gaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacca 

tcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccact 

gcaacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcc 

cagttcggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgat 

gcacagcttcaactgcggcggcgagttcttctactgcaacagcacccagctgttcaa 

ctggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgccctgccgcatca 

agcagatcatcaaccgctggcaggaggtgggcaaggccatgtacgccccccccatccgcggc 

cagatccgctgcagcagcaacatcaccggcctgctgctgacccgcgacggcggcaaggagat 

cagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgacaactggcgcagcg 

agctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccaccaaggccaag 

cgccgcgtggtgcagcgcgagaagcgcgccgtg^ 

ggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgccagct 

gctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagcagcacc 

tgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtggagcgc 

tacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgcaccac 

cgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatga 

cctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgag 

gagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggcca 

gcctgtggaactggttcgacatcagcaagtggctgtggtacatcaagatcttcatcatgatcg 

tgggcggcctggtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaaccgcgtgcgcc 

agggctacagccccctgagcttccagacccgcttccccgccccccgcggccccgaccgccccg 

agggcatcgaggaggagggcggcgagcgcgaccgcgaccgcagcagccccctggtgcacgg 

cctgctggccctgatctgggacgacctgcgcagcctgtgcctgtrcagct 

cgacctgatcctgatcgccgcccgcatcgtggagctgctgggccgccgcggctgggaggccct 

gaagtactggggcaacctgctgcagtactggatccaggagctgaagaacagcgccgtgagcc 

tgttcgacgccatcgccatcgccgtggccgagggcaccgaccgcatcatcgaggtggcccagc 

gcatcggccgcgccttcctgcacatcccccgccgcatccgccagggcttcgagcgcgccctgc 

TGTAACTCGAG 

SEQ ID NO:5 VAL120-ILE201B 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtg^ 

tttcgccc^gcgccgtggagaagctgtgggtgacx;gtgtactacggcgtgcccgtgtggaaggaggcca 

ccaccaccctgttctgcckxagcgacgccaaggcctacgacaccgaggtc^ 

ACGCCHTjCGTGCCCA(X:GACCCCAACCCCCAGGAGATCGTGCT^ 

TGTGGAAGAACAACATGGTGGAGCAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGC 
CCTGCGTGCCCGGCATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGC 
CCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCX^CCTGCACCAACXjT 

gagcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcct 

ggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagct 

gaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccc 

(xgccgcgccttctacgccaccggcgacatcatcggcgagatccgccaggcccactgcaacatcagcggc 

gagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcx3gcaacaagaccatc 

GTGTTCAAGCAGAGCAGCGGCGGCGACCX^CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTC 
TTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

ggcaccatcaccctgccctckx:gcatcaagcagatcatcaaccgctggcaggaggtgggcaaggccatg 
tacgccccccccatccgcgckx^agatccgctgcagcagcaacatcaccggcctgctgctgacccgcgacg 

GCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGC 

GCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGC 

GCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCITCCT 

CGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGT 

GCAGCAGCAGAACAACCIXJCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG 

CATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCAT 

CTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAG 

CCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCT 

GATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGG 

ACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTXjCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAG 

GGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCG 

AGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCT 

GGGACGACCTGCGCAGCXrrGTGCCTGTrCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCG 

CATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTG 

GATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCAC 

CGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCXrCCCGCCGCATCCGCCAG 

GGCTTCGAGCGCGCCCTGCTGTAACTCGAGCGTGCT 

SEQ ID NO:6 LYS121-VAL200 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCXAACCXJCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGGCCCCCGTGATCACCCA 

GGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC 

CATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCG 

TGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGG 

CCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCX^AAGACCATCATCGTG 

CAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCAT 

CACCATCGGCCGCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGC 

CCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTC^ 

AGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATC 

gtgatgcacagcttcaactgcggcggcgagttctix^ 

agcacctggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgccctgccg 
catcaagcagatcatcaaccgctggcaggaggtgggcaaggccatgtacgccccccccatcc 
gcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgcgacggcggcaag 
gagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgacaactggcg 
cagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccaccaagg 
ccaack:gccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggcgccatgttcctgggc 
ttcchxjGgcgccgccggcagcaccatgggcgcccgcagc^ 

cagctgctgagcckk:atcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagca 

gcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtgg 

agcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaa 

catgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctga 

tcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtg 

ggccagcctgtggaactggttcgacatcagcaagtggctgtggtacatcaagatcttcatc^ 

gatcgtgggcggcctggtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaaccgcgt 

gcgccagggctacagccccctgagcttccagacccgcttccccgccccccgcggccccgaccg 

ccccgagggcatcgaggaggagggcggcgagcgcgaccgcgaccgcagcagccccctggtgc 

acggcctgctggccctgatctgggacgacctgcgc^ 

tgcgcgacctgatcctgatcgccgcccgcatcgtggagctgctgggccgccgcggctgggagg 

ccctgaagtactggggcaacctgctgcagtactggatccaggagctgaagaacagcgccgtg 

agcctgttcgacgccatcgccatcgccgtggccgagggcaccgaccgcatcatcgaggtggcc 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCACKjGCTTCGAGCGCGCC 
CTGCTGTAACTCGAGCGTGCT 

SEQ ID NO:7 LEU122-SER199 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTG<XGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCX; 

CCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGC 

GGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAA 

CTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCA 

CCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTC 

CTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAG 

GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGC 

CCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGG 

CCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTG 

ATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTG 

GAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACA 

ccctgatcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctgga 
caagtggck:cagcctxjTggaactggtc 

catcatgatcgtgggcggcctggtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaa 

CCGCGTGCGCCAGGGCTACAGCCCCCTGAGCITCCAGACCCGCTTCCCCGCCCCCCGCCKj^ 

CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCC 

CTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTAC 

CACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGC 

TGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAG 

CGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGA 

GGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGA 

GCGCGCCCTGCTGTAACTCGAGCGTGCT 

SEQ ID NO:8 VAL120-THR202 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCGCCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAG 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGC 

CAGATCCGCTGCAGCAGCAACATCACCCKK:CTGCTXKTGACCCGCGACGGCGGCAAGGAGAT 

CAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCG 

AGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAG 

CGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTG 

CKjCGCCGCCGGCAGCACCATGGGCGCCCGOVGCCTGACCCTGACCGTGCAGGCCCGCCAGCT 

GCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACC 

TGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGC 

TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCAC 

CGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGA 

CCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAG 

GAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCA 

GCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCG 

TGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCC 

AGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCG 

AGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGG 

CCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCG 

CGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCT 

GAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCC 

TGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGC 

GCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGC 

TGTAACTCGAG 

SEQ ID NO:9 TRP427-GLY431 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 
gtcttcgtttcgcccagcgccgtggagaagc^ 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccxrixrrgcgtg 

accctgcactgcaccaacctgaagaacgccaccaacaccaagagcagcaactggaaggagat 

ggaccgcggcgagatcaagaactgcagcttcaaggtgaccaccagcatccgcaacaagatgc 

agaaggagtacgccctgttctacaagctggacgtggtgcccatcgacaacgacaacaccagc 

tacaagctgatcaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcga 

gcccatccccatccactactxkiigcccccgccggcttcgccatcctgaagtgcaacgacaagaa 

gttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggcatccgcc 

ccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatccgc 

agcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagat 

caactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgcct 

tctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgag 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCT 

GGGGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATC 

ACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCG 

CCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGA 

AGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 

CGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGC 

GCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCA 

GAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCA 

TCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTG 

GGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTG 

GAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAG 

ATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAA 

GAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCA 

GCAAGTGGCTGTGGTAC^TCAAGATCTTCATCATGATCGTGGGCGGCCTG^ 

TCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCC 

AGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGC 

GAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGA 

CCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCG 

CATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGC 

AGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCC 

GTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCA 

CATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 

SEQ ID NO:10 ARG426-GLY431 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctg 

gtcttcgtttcck:ccack:gccgtggagaagctgtgggtgaccgtgtacta 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtg 

accctgcactgcaccaacctgaagaacgccaccaacaccaagagcagcaactggaaggagat 

ggaccgcggcgagatcaagaactgcagcttcaaggtgaccaccagcatccgcaacaagatgc 

agaaggagtacgccctgttctacaagctggacgtggtgcccatcgacaacgacaacaccagc 

tacaagctgatcaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcga 

gcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgo^ 

gttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggcatccgcc 

ccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatccgc 

agcgagaacttcaccgacaacgccaagactatcatcgtgcagctgaaggagagcgtggagat 

caactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgcct 

tctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgag 

aagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagac 

catcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcg 

gcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcg 

gccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgc 

ggcggcggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacat 

caccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttcc 

gccccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtg 

aagatcgagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaa 

gcgcgccgtgaccctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatggg 

cgcccgcagcctgaccctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagc 

agaacaacctgctgcgcgccatcgaggcccagcagcacctgctgcagctgaccgtgtggggc 

atcaagcagctgcaggcccgcgtgctggccgtggagcgctacctgaaggaccagcagctgct 

gggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccctggaacgccagct 

ggagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgggagcgcga 

gatcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcaggaga 

agaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgacatc 
agcaagtggctgtckjtacatcaagatcttc^ 

atcgtgttcaccgtgctgagcatcgtgaaccgcgtgcgccagggctacagccccctgagcttc 

cagacccgcttccccgccccccgcggccccgaccgccccgagggcatcgaggaggagggcgg 

cgagcgcgaccgcgaccgcagcagccccctggtgcacggcctgctggccctgatctgggacg 

acctgcgcagcctgtgcctgttcagctaccaccgcctgcgcgacctgatcctgatcgccgccc 

gcatcgtggagctgctgggccgccgcggctgggaggccctgaagtactggggcaacctgctg 

cagtactggatccaggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatcgc 

cgtggccgagggcaccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcctgc 

acatcccccgccgcatccgccagggcttcgagcgcgccctgctgtaactcgag 

SEQ ID NO:11 ARG426-GLY431B 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 
GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 
TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 
GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtg 

accctgcactgcaccaacctgaagaacgccaccaacaccaagagcagcaactggaaggagat 

ggaccgcggcgagatcaagaactgcagcttcaaggtgaccaccagcatccgcaacaagatgc 

agaaggagtacgccctgttctacaagctggacgtggtgcccatcgacaacgacaacaccagc 

tacaagctgatcaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcga 

gcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgcaacgacaagaa 

gttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggcatccgcc 

ccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatccgc 

agcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagat 

caactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgcct 

tctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgag 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCAGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

cagtactggatccaggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatcgc 
cgtggccgagggcaccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcctgc 
acatcccccgccgcatccgccagggcttcgagcgcgccctgctgtaactcgag 

SEQ ID NO:12 ARG426-LYS432 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTXjAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGT^ 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCGGCAACAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:13 ASN425-LYS432 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACGCCC 

CCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCC 

TGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGC 

GGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGA 

GCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCG 

TGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCA 

GCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAAC 

CTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCA 

GCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCT 

GGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAAC 

AAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAA 

CTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGC 

AGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGG 

CTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTC 

ACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGC 

TTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGA 

CCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAG 

CCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGA 

GCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGA 

TCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAG 

GGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGC 

CGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:14 ILE424-ALA433 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtg 

accctgcactgcaccaacctgaagaacgccaccaacaccaagagcagcaactggaaggagat 

ggaccgcggcgagatcaagaactgcagcttcaaggtgaccaccagcatccgcaacaagatgc 

agaaggagtacgccctgttctacaagctggacgtggtgcccatcgacaacgacaacaccagc 

tacaagctgatcaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcga 

gcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgcaacgacaagaa 

gttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggcatccgcc 

ccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatccgc 

agcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagat 

caactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgcct 

tctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgag 

aagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagac 

catcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcg 

gcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcg 

gccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcggcggc 

gccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggcctgctg 

ctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggcgg 

cgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccc 

tgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgacc 

ctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctg 

accctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgct 

gcgcgccatcgaggcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgc 

aggcccgcgtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggc 

tgcagcggcaagctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagag 

cctggaccagatctggaacaacatgacctggatggagtgggagcgcgagatcgacaactaca 

ccaacctgatctacaccctgatcgaggagagccagaaccagcaggagaagaacgagcagga 

gctgctggagctggacaagtgggccagcctgtggaactggttcgacatcagcaagtggctgt 

ggtacatcaagatcttcatcatgatcgtgggcggcctggtgggcctgcgcatcgtgttcaccg 

tgctgagcatcgtgaaccgcgtgcgccagggctacagccccctgagcttccagacccgcttcc 

ccgccccccgcggccccgaccgccccgagggcatcgaggaggagggcggcgagcgcgaccgc 

gaccgcagcagccccctggtgcacggcctgctggccctgatctgggacgacctgcgcagcctg 

tgcctgttcagctaccaccgcctgcgcgacctgatcctgatcgccgcccgcatcgtggagctg 

ctgggccgccgcggctgggaggccctgaagtactggggcaacctgctgcagtactggatcca 

ggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatcgccgtggccgagggca 

ccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcctgcacatcccccgccgca 

tccgccagggcttcgagcgcgccctgctgtaactcgag 
SEQ ID NO:15 ILE423-MET434 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtg 

accctgcactgcaccaacctgaagaacgccaccaacaccaagagcagcaactggaaggagat 

ggaccgcggcgagatcaagaactgcagcttcaaggtgaccaccagcatccgcaacaagatgc 

agaaggagtacgccctgttctacaagctggacgtggtgcccatcgacaacgacaacaccagc 

tacaagctgatcaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcga 

gcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgcaacgacaagaa 

gttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggcatccgcc 

ccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatccgc 

agcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagat 

caactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgcct 

tctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgag 

aagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagac 

catcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcg 

gcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcg 

gccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcggcggcatg 

tacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggcctgctgctgacc 

cgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggcggcgacat 

gcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcg 

tggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggc 

gccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctg 

accgtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgc 

catcgaggcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggccc 

gcgtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagc 

ggcaagctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctgga 

ccagatctggaacaacatgacctggatggagtgggagcgcgagatcgacaactacaccaacc 

tgatctacaccctgatcgaggagagccagaaccagcaggagaagaacgagcaggagctgctg 

gagctggacaagtgggccagcctgtggaactggttcgacatcagcaagtggctgtggtacat 

caagatcttcatcatgatcgtgggcggcctggtgggcctgcgcatcgtgttcaccgtgctgag 

catcgtgaaccgcgtgcgccagggctacagccccctgagcttccagacccgcttccccgcccc 

ccgcggccccgaccgccccgagggcatcgaggaggagggcggcgagcgcgaccgcgaccgc 

agcagccccctggtgcacggcctgctggccctgatctgggacgacctgcgcagcctgtgcctg 

ttcagctaccaccgcctgcgcgacctgatcctgatcgccgcccgcatcgtggagctgctgggc 

cgccgcggctgggaggccctgaagtactggggcaacctgctgcagtactggatccaggagct 

gaagaacagcgccgtgagcctgttcgacgccatcgccatcgccgtggccgagggcaccgacc 

gcatcatcgaggtggcccagcgcatcggccgcgccttcctgcacatcccccgccgcatccgcc 

agggcttcgagcgcgccctgctgtaactcgag 
SEQ ID NO:16 GLN422-TYR435 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

ggaccgcggcgagatcaagaactgcagcttcaaggtgaccaccagcatccgcaacaagatgc 
agaaggagtacgccctgttctacaagctggacgtggtgcccatcgacaacgacaacaccagc 
tacaagctgatcaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcga 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGGGCGGCTACGCC 

CCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGAC 

GGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCC 

CCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATG 

TTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTG 

CAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGA 

GGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGC 

TGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAG 

CTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGAT 

CTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCT 

ACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCT 

GGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGA 

TCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCG 

TGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCG 

GCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAG 

CCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAG 

CTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCG 

CGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGA 

ACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATC 

ATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGC 

TTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:17 GLN422-TYR435B 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtg 

accctgcactgcaccaacctgaagaacgccaccaacaccaagagcagcaactggaaggagat 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTrc 

CCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACG 

GCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGAC 

AACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCC 

CACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGT 

TCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGC 

AGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 

GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCT 

GGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGC 

TGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATC 

TGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTA 

CACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTG 

GACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGAT 

CTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGT 

GAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGG 

CCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGC 

CCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGC 

TACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGC 

GGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAA 

CAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCAT 

CGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTT 

CGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:18 LEU122-SER199; ARG426-GLY431 

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgggcaacagcgtgat 

cacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccgg 

cttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtga 

gcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggc 

agcctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccat 

catcgtgcagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgca 

agagcatcaccatcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatcc 

gccaggcccactgcaacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgacc 

aagctgcaggcccagttcggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccc 

cgagatcgtgatgcacagcttcaactgcggcggcgagttcttctactgcaacagcacccagct 

gttcaacagcacctggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgc 

cctgccgcatcaagcagatcatcaaccgcggcggcggcaaggccatgtacgccccccccatcc 

gcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgcgacggcggcaag 

gagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgacaactggcg 

cagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccaccaagg 

ccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggcgccatgttcctgggc 

ttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgc 

cagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagca 

gcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtgg 

agcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaa 

catgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctga 

tcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtg 

ggccagcctgtggaactggttcgacatcagcaagtggctgtggtacatcaagatcttcatcat 

gatcgtgggcggcctggtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaaccgcgt 

gcgccagggctacagccccctgagcttccagacccgcttccccgccccccgcggccccgaccg 

ccccgagggcatcgaggaggagggcggcgagcgcgaccgcgaccgcagcagccccctggtgc 

acggcctgctggccctgatctgggacgacctgcgcagcctgtgcctgttcagctaccaccgcc 

tgcgcgacctgatcctgatcgccgcccgcatcgtggagctgctgggccgccgcggctgggagg 

ccctgaagtactggggcaacctgctgcagtactggatccaggagctgaagaacagcgccgtg 

agcctgttcgacgccatcgccatcgccgtggccgagggcaccgaccgcatcatcgaggtggcc 

cagcgcatcggccgcgccttcctgcacatcccccgccgcatccgccagggcttcgagcgcgcc 

ctgctgtaactcgag 
SEQ ID NO:19 LEU122-SER199; ARG426-LYS432 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCGGCGGCAACAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

agcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaa 

catgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctga 

tcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtg 

ggccagcctgtggaactggttcgacatcagcaagtggctgtggtacatcaagatcttcatcat 

gatcgtgggcggcctggtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaaccgcgt 

gcgccagggctacagccccctgagcttccagacccgcttccccgccccccgcggccccgaccg 

ccccgagggcatcgaggaggagggcggcgagcgcgaccgcgaccgcagcagccccctggtgc 

acggcctgctggccctgatctgggacgacctgcgcagcctgtgcctgttcagctaccaccgcc 

tgcgcgacctgatcctgatcgccgcccgcatcgtggagctgctgggccgccgcggctgggagg 

ccctgaagtactggggcaacctgctgcagtactggatccaggagctgaagaacagcgccgtg 

agcctgttcgacgccatcgccatcgccgtggccgagggcaccgaccgcatcatcgaggtggcc 

cagcgcatcggccgcgccttcctgcacatcccccgccgcatccgccagggcttcgagcgcgcc 

ctgctgtaactcgag 
SEQ ID NO:20 LEU122-SER199; TRP427-GLY431 



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaagctgggcaacagcgtgat 

cacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccgg 

cttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtga 

gcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggc 

agcctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccat 

catcgtgcagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgca 

agagcatcaccatcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatcc 

gccaggcccactgcaacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgacc 

aagctgcaggcccagttcggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccc 

cgagatcgtgatgcacagcttcaactgcggcggcgagttcttctactgcaacagcacccagct 

gttcaacagcacctggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgc 

cctgccgcatcaagcagatcatcaaccgctggggcggcaaggccatgtacgccccccccatcc 

gcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgcgacggcggcaag 

gagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgacaactggcg 

cagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccaccaagg 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAG 
WO 00/39303 PCT/US99/31272 
SEQ ID NO:21 LYS121-VAL200; ASN425-LYS432 



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagca 

gtcttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtg 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgaaggcccccgtgatcaccca 

ggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgc 

catcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccg 

tgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctgg 

CCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTG 

cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcat 

caccatcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggc 

ccactgcaacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgc 

aggcccagttcggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatc 

gtgatgcacagcttcaactgcggcggcgagttcttctactgcaacagcacccagctgttcaac 

agcacctggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgccctgccg 

catcaagcagatcatcaacgcccccaaggccatgtacgccccccccatccgcggccagatccg 

ctgcagcagcaacatcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaaca 

ccaccgagatcttccgccccggcggcggcgacatgcgcgacaactggcgcagcgagctgtac 

aagtacaaggtggtgaagatcgagcccctgggcgtggcccccaccaaggccaagcgccgcgt 

ggtgcagcgcgagaagcgcgccgtgaccctgggcgccatgttcctgggcttcctgggcgccgc 

CGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCG 

GCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAG 

CTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAA 

GGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGC 

CCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATG 

GAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCA 

GAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGG 

AACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGC 

CTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTAC 

AGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATC 

GAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGC 

CCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGAT 

CCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTG 

GGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACG 

CCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGC 

CGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTC 

GAG 

SEQ ID NO:22 VAL120-ILE201; ILE424-ALA433 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCATCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGC 

AACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGAT 

CTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGG 

TGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGC 

GAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACC 

ATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCA 

GCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGT 

GGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAG 

CTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCC 

AGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCG 

CGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGG 

AGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGAC 

ATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTG 

CGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGC 

TTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGG 

CGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCG 

CCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTG 

CTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:23 VAL120-ILE201B; ILE424-ALA433 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGCCCGGCATCACCCAGGCCTGC 

CCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTG 

AAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTG 

CACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGG 

AGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTG 

AAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCAT 

CGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG 

CAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCC 

AGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATG 

cacagcttcaactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacc 

tggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaa 

gcagatcatcggcggcgccatgtacgccccccccatccgcggccagatccgctgcagcagca 

acatcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatc 

ttccgccccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggt 

ggtgaagatcgagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcg 

agaagcgcgccgtgaccctgggcgccatgttcctgggcttcctgggcgccgccggcagcacca 

tgggcgcccgcagcctgaccctgaccgtgcaggcccgccagctgctgagcggcatcgtgcag 

cagcagaacaacctgctgcgcgccatcgaggcccagcagcacctgctgcagctgaccgtgtg 

gggcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctgaaggaccagcagc 

tgctgggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccctggaacgcca 

gctggagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgggagcgc 

gagatcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcagga 

gaagaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgaca 

tcagcaagtggctgtggtacatcaagatcttcatcatgatcgtgggcggcctggtgggcctgc 

gcatcgtgttcaccgtgctgagcatcgtgaaccgcgtgcgccagggctacagccccctgagct 

tccagacccgcttccccgccccccgcggccccgaccgccccgagggcatcgaggaggagggc 

ggcgagcgcgaccgcgaccgcagcagccccctggtgcacggcctgctggccctgatctggga 

cgacctgcgcagcctgtgcctgttcagctaccaccgcctgcgcgacctgatcctgatcgccgc 

ccgcatcgtggagctgctgggccgccgcggctgggaggccctgaagtactggggcaacctgc 

tgcagtactggatccaggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatc 

gccgtggccgagggcaccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcct 

gcacatcccccgccgcatccgccagggcttcgagcgcgccctgctgtaactcgag 
SEQ ID NO:24 VAL120-THR202; ILE424-ALA433 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 
GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

tggaaggaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggt 

gcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgct 

ggagaacgtgaccgagaacttcaacatgtggaagaacaacatggtggagcagatgcacgag 

gacatcatcagcctgtgggaccagagcctgaagccctgcgtgggcggcgccacccaggcctg 

ccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcct 

gaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagt 

gcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 

gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagct 

gaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacca 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

gcaacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcc 

cagttcggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgat 

gcacagcttcaactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcac 

ctggaacaacaccatcggccccaacaacaccaacggcaccatcaccctgccctgccgcatca 

agcagatcatcggcggcgccatgtacgccccccccatccgcggccagatccgctgcagcagc 

aacatcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagat 

cttccgccccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaagg 

tggtgaagatcgagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgc 

gagaagcgcgccgtgaccctgggcgccatgttcctgggcttcctgggcgccgccggcagcacc 

ATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCA 

GCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGT 

GGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAG 

CTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCC 

AGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCG 

CGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGG 

AGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGAC 

ATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTG 

CGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGC 

TTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGG 

CGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCG 

CCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTG 

CTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:25 VAL127-ASN195

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

GGGGCAGGGAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCC 

CATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTT 

CAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCG 

TGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGC 

GAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGATCAA 

CTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTA 

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC 

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGG 

CGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCC 

CAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGC 

AGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAAC 

ATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTT 

CCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGG 

TGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAG 

AAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATG 

GGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCA 

GCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG 

GCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTG 

CTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAG 

CTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCG 

AGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAG 

AAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACAT 

CAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCG 

CATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTT 

CCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCG 

GCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGAC 

GACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCC 

CGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCT 

GCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCG 

CCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:26 VAL127-ASN195; ARG426-GLY431



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

GGGGCAGGGAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCC 

CATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTT

CAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCG 

TGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGC 

GAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGATCAA 

CTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTA 

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGG 

CGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCC

CAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCGGCG 

GCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACC 

GGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCC 

CGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAG

atcgagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcg

cgccgtgaccctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgc

CCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGA 

ACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATC 

AAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGG 

CATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGA 

GCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATC 

GACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAA 

CGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCA 

AGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCG

TGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGA 

CCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAG

CGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTG 

CGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATC 

GTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTA 

CTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGG 

CCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCC 

CCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
<110> Chiron Corporation

<120> MODIFIED HIV ENV POLYPEPTIDES 

<130> 1605.100 

<140> 
<141> 

<160> 26 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 856 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 1 

Met Arg Val Lys Glu Lys Tyr Gln His Leu Trp Arg Trp Gly Trp Arg
1 5 10 15

Trp Gly Thr Met Leu Leu Gly Met Leu Met Ile Cys Ser Ala Thr Glu
20 25 30

Lys Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala
35 40 45

Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu
50 55 60

Val His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn
65 70 75 80

Pro Gln Glu Val Val Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp
85 90 95

Lys Asn Asp Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp
100 105 110

Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Ser
115 120 125

Leu Lys Cys Thr Asp Leu Lys Asn Asp Thr Asn Thr Asn Ser Ser Ser
130 135 140

Gly Arg Met Ile Met Glu Lys Gly Glu Ile Lys Asn Cys Ser Phe Asn
145 150 155 160

Ile Ser Thr Ser Ile Arg Gly Lys Val Gln Lys Glu Tyr Ala Phe Phe
165 170 175

Tyr Lys Leu Asp Ile Ile Pro Ile Asp Asn Asp Thr Thr Ser Tyr Lys
180 185 190

Leu Thr Ser Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys Val
195 200 205

Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe Ala
210 215 220

Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys Thr
225 230 235 240

Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val Ser
245 250 255

Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val Ile
260 265 270

Arg Ser Val Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile Val Gln Leu
275 280 285

Asn Thr Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr Arg
290 295 300

Lys Arg Ile Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val Thr Ile
305 310 315 320

Gly Lys Ile Gly Asn Met Arg Gln Ala His Cys Asn Ile Ser Arg Ala
325 330 335

Lys Trp Asn Asn Thr Leu Lys Gln Ile Ala Ser Lys Leu Arg Glu Gln
340 345 350

Phe Gly Asn Asn Lys Thr Ile Ile Phe Lys Gln Ser Ser Gly Gly Asp
355 360 365

Pro Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr
370 375 380

Cys Asn Ser Thr Gln Leu Phe Asn Ser Thr Trp Phe Asn Ser Thr Trp
385 390 395 400

Ser Thr Glu Gly Ser Asn Asn Thr Glu Gly Ser Asp Thr Ile Thr Leu
405 410 415

Pro Cys Arg Ile Lys Gln Ile Ile Asn Met Trp Gln Lys Val Gly Lys
420 425 430

Ala Met Tyr Ala Pro Pro Ile Ser Gly Gln Ile Arg Cys Ser Ser Asn
435 440 445

Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Asn Ser Asn Asn Glu
450 455 460

Ser Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg
465 470 475 480

Ser Glu Leu Tyr Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Val
485 490 495

Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala
500 505 510

Val Gly Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser
515 520 525

Thr Met Gly Ala Ala Ser Met Thr Leu Thr Val Gln Ala Arg Gln Leu
530 535 540

Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu
545 550 555 560

Ala Gln Gln His Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu
565 570 575

Gln Ala Arg Ile Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln Leu
580 585 590

Leu Gly Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val
595 600 605

Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Glu Gln Ile Trp Asn
610 615 620

His Thr Thr Trp Met Glu Trp Asp Arg Glu Ile Asn Asn Tyr Thr Ser
625 630 635 640

Leu Ile His Ser Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn
645 650 655

Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp
660 665 670

Phe Asn Ile Thr Asn Trp Leu Trp Tyr Ile Lys Leu Phe Ile Met Ile
675 680 685

Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser Ile
690 695 700

Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr His
705 710 715 720

Leu Pro Thr Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu Glu
725 730 735

Gly Gly Glu Arg Asp Arg Asp Arg Ser Ile Arg Leu Val Asn Gly Ser
740 745 750

Leu Ala Leu Ile Trp Asp Asp Leu Arg Ser Leu Cys Leu Phe Ser Tyr
755 760 765

His Arg Leu Arg Asp Leu Leu Leu Ile Val Thr Arg Ile Val Glu Leu
770 775 780

Leu Gly Arg Arg Gly Trp Glu Ala Leu Lys Tyr Trp Trp Asn Leu Leu
785 790 795 800

Gln Tyr Trp Ser Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn
805 810 815

Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val
820 825 830

Val Gln Gly Ala Cys Arg Ala Ile Arg His Ile Pro Arg Arg Ile Arg
835 840 845

Gln Gly Leu Glu Arg Ile Leu Leu
850 855



<210> 2 
<211> 847 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 2 

Met Arg Val Lys Gly Ile Arg Lys Asn Tyr Gln His Leu Trp Arg Gly
1 5 10 15

Gly Thr Leu Leu Leu Gly Met Leu Met Ile Cys Ser Ala Val Glu Lys
20 25 30

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr
35 40 45

Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val
50 55 60

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro
65 70 75 80

Gln Glu Ile Val Leu Glu Asn Val Thr Glu Asn Phe Asn Met Trp Lys
85 90 95

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp
100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu
115 120 125

His Cys Thr Asn Leu Lys Asn Ala Thr Asn Thr Lys Ser Ser Asn Trp
130 135 140

Lys Glu Met Asp Arg Gly Glu Ile Lys Asn Cys Ser Phe Lys Val Thr
145 150 155 160

Thr Ser Ile Arg Asn Lys Met Gln Lys Glu Tyr Ala Leu Phe Tyr Lys
165 170 175

Leu Asp Val Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr Lys Leu Ile
180 185 190

Asn Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys Val Ser Phe
195 200 205

Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe Ala Ile Leu
210 215 220

Lys Cys Asn Asp Lys Lys Phe Asn Gly Ser Gly Pro Cys Thr Asn Val
225 230 235 240

Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val Ser Thr Gln
245 250 255

Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Gly Val Val Ile Arg Ser
260 265 270

Glu Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile Val Gln Leu Lys Glu
275 280 285

Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr Arg Lys Ser
290 295 300

Ile Thr Ile Gly Pro Gly Arg Ala Phe Tyr Ala Thr Gly Asp Ile Ile
305 310 315 320

Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Gly Glu Lys Trp Asn
325 330 335

Asn Thr Leu Lys Gln Ile Val Thr Lys Leu Gln Ala Gln Phe Gly Asn
340 345 350

Lys Thr Ile Val Phe Lys Gln Ser Ser Gly Gly Asp Pro Glu Ile Val
355 360 365

Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys Asn Ser Thr
370 375 380

Gln Leu Phe Asn Ser Thr Trp Asn Asn Thr Ile Gly Pro Asn Asn Thr
385 390 395 400

Asn Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln Ile Ile Asn Arg
405 410 415

Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro Ile Arg Gly Gln
420 425 430

Ile Arg Cys Ser Ser Asn Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly
435 440 445

Gly Lys Glu Ile Ser Asn Thr Thr Glu Ile Phe Arg Pro Gly Gly Gly
450 455 460

Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val
465 470 475 480

Lys Ile Glu Pro Leu Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val
485 490 495

Val Gln Arg Glu Lys Arg Ala Val Thr Leu Gly Ala Met Phe Leu Gly
500 505 510

Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala Arg Ser Leu Thr Leu
515 520 525

Thr Val Gln Ala Arg Gln Leu Leu Ser Gly Ile Val Gln Gln Gln Asn
530 535 540

Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His Leu Leu Gln Leu Thr
545 550 555 560

Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val Leu Ala Val Glu Arg
565 570 575

Tyr Leu Lys Asp Gln Gln Leu Leu Gly Ile Trp Gly Cys Ser Gly Lys
580 585 590

Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala Ser Trp Ser Asn Lys
595 600 605

Ser Leu Asp Gln Ile Trp Asn Asn Met Thr Trp Met Glu Trp Glu Arg
610 615 620

Glu Ile Asp Asn Tyr Thr Asn Leu Ile Tyr Thr Leu Ile Glu Glu Ser
625 630 635 640

Gln Asn Gln Gln Glu Lys Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys
645 650 655

Trp Ala Ser Leu Trp Asn Trp Phe Asp Ile Ser Lys Trp Leu Trp Tyr
660 665 670

Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Val Gly Leu Arg Ile
675 680 685

Val Phe Thr Val Leu Ser Ile Val Asn Arg Val Arg Gln Gly Tyr Ser
690 695 700

Pro Leu Ser Phe Gln Thr Arg Phe Pro Ala Pro Arg Gly Pro Asp Arg
705 710 715 720

Pro Glu Gly Ile Glu Glu Glu Gly Gly Glu Arg Asp Arg Asp Arg Ser
725 730 735

Ser Pro Leu Val His Gly Leu Leu Ala Leu Ile Trp Asp Asp Leu Arg
740 745 750

Ser Leu Cys Leu Phe Ser Tyr His Arg Leu Arg Asp Leu Ile Leu Ile
755 760 765

Ala Ala Arg Ile Val Glu Leu Leu Gly Arg Arg Gly Trp Glu Ala Leu
770 775 780

Lys Tyr Trp Gly Asn Leu Leu Gln Tyr Trp Ile Gln Glu Leu Lys Asn
785 790 795 800

Ser Ala Val Ser Leu Phe Asp Ala Ile Ala Ile Ala Val Ala Glu Gly
805 810 815

Thr Asp Arg Ile Ile Glu Val Ala Gln Arg Ile Gly Arg Ala Phe Leu
820 825 830

His Ile Pro Arg Arg Ile Arg Gln Gly Phe Glu Arg Ala Leu Leu
835 840 845



<210> 3 

<211> 2310 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Val120-Ala204
<400> 3 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360 
ggcgcctgcc ccaaggtgag cttcgagccc atccccatcc actactgcgc ccccgccggc 420 
ttcgccatcc tgaagtgcaa cgacaagaag ttcaacggca gcggcccctg caccaacgtg 480 
agcaccgtgc agtgcaccca cggcatccgc cccgtggtga gcacccagct gctgctgaac 540
ggcagcctgg ccgaggaggg cgtggtgatc cgcagcgaga acttcaccga caacgccaag 600 
accatcatcg tgcagctgaa ggagagcgtg gagatcaact gcacccgccc caacaacaac 660 
acccgcaaga gcatcaccat cggccccggc cgcgccttct acgccaccgg cgacatcatc 720
ggcgacatcc gccaggccca ctgcaacatc agcggcgaga agtggaacaa caccctgaag 780 
cagatcgtga ccaagctgca ggcccagttc ggcaacaaga ccatcgtgtt caagcagagc 840
agcggcggcg accccgagat cgtgatgcac agcttcaact gcggcggcga gttcttctac 900 
tgcaacagca cccagctgtt caacagcacc tggaacaaca ccatcggccc caacaacacc 960 
aacggcacca tcaccctgcc ctgccgcatc aagcagatca tcaaccgctg gcaggaggtg 1020 
ggcaaggcca tgtacgcccc ccccatccgc ggccagatcc gctgcagcag caacatcacc 1080 
ggcctgctgc tgacccgcga cggcggcaag gagatcagca acaccaccga gatcttccgc 1140
cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 1200 
aagatcgagc ccctgggcgt ggcccccacc aaggccaagc gccgcgtggt gcagcgcgag 1260 
aagcgcgccg tgaccctggg cgccatgttc ctgggcttcc tgggcgccgc cggcagcacc 1320 
atgggcgccc gcagcctgac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 1380
cagcagcaga acaacctgct gcgcgccatc gaggcccagc agcacctgct gcagctgacc 1440 
gtgtggggca tcaagcagct gcaggcccgc gtgctggccg tggagcgcta cctgaaggac 1500 
cagcagctgc tgggcatctg gggctgcagc ggcaagctga tctgcaccac cgccgtgccc 1560 
tggaacgcca gctggagcaa caagagcctg gaccagatct ggaacaacat gacctggatg 1620
gagtgggagc gcgagatcga caactacacc aacctgatct acaccctgat cgaggagagc 1680 
cagaaccagc aggagaagaa cgagcaggag ctgctggagc tggacaagtg ggccagcctg 1740
tggaactggt tcgacatcag caagtggctg tggtacatca agatcttcat catgatcgtg 1800 
ggcggcctgg tgggcctgcg catcgtgttc accgtgctga gcatcgtgaa ccgcgtgcgc 1860 
cagggctaca gccccctgag cttccagacc cgcttccccg ccccccgcgg ccccgaccgc 1920
cccgagggca tcgaggagga gggcggcgag cgcgaccgcg accgcagcag ccccctggtg 1980 
cacggcctgc tggccctgat ctgggacgac ctgcgcagcc tgtgcctgtt cagctaccac 2040
cgcctgcgcg acctgatcct gatcgccgcc cgcatcgtgg agctgctggg ccgccgcggc 2100 
tgggaggccc tgaagtactg gggcaacctg ctgcagtact ggatccagga gctgaagaac 2160 
agcgccgtga gcctgttcga cgccatcgcc atcgccgtgg ccgagggcac cgaccgcatc 2220 
atcgaggtgg cccagcgcat cggccgcgcc ttcctgcaca tcccccgccg catccgccag 2280
ggcttcgagc gcgccctgct gtaactcgag 2310 

<210> 4 
<211> 2316 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Ile201
<400> 4 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagtcca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840 
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020 
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080 
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140 
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260 
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380 
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440 
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620 
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680 
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740 
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 
atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040 
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220 
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 
cgccagggct tcgagcgcgc cctgctgtaa ctcgag 2316 

<210> 5 
<211> 2322 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Ile201B
<400> 5 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgcccggc 360
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020

gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080 

atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140

ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200 

gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260 

cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 

agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380 

atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440 

ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 

aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 

gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620 

tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680 

gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740

agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 

atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 

gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 

gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 

ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040

taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 

cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 

aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220

cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 

cgccagggct tcgagcgcgc cctgctgtaa ctcgagcgtg ct 2322 

<210> 6 
<211> 2328 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Lys121-Val200
<400> 6 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaaggcc 360 
cccgtgatca cccaggcctg ccccaaggtg agcttcgagc ccatccccat ccactactgc 420 
gcccccgccg gcttcgccat cctgaagtgc aacgacaaga agttcaacgg cagcggcccc 480
tgcaccaacg tgagcaccgt gcagtgcacc cacggcatcc gccccgtggt gagcacccag 540 
ctgctgctga acggcagcct ggccgaggag ggcgtggtga tccgcagcga gaacttcacc 600 
gacaacgcca agaccatcat cgtgcagctg aaggagagcg tggagatcaa ctgcacccgc 660 
cccaacaaca acacccgcaa gagcatcacc atcggccccg gccgcgcctt ctacgccacc 720 
ggcgacatca tcggcgacat ccgccaggcc cactgcaaca tcagcggcga gaagtggaac 780 
aacaccctga agcagatcgt gaccaagctg caggcccagt tcggcaacaa gaccatcgtg 840 
ttcaagcaga gcagcggcgg cgaccccgag atcgtgatgc acagcttcaa ctgcggcggc 900 
gagttcttct actgcaacag cacccagctg ttcaacagca cctggaacaa caccatcggc 960 
cccaacaaca ccaacggcac catcaccctg ccctgccgca tcaagcagat catcaaccgc 1020
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200 
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380 
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980 
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040 
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220 
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg agcgtgct 2328 

<210> 7 
<211> 2334 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199
<400> 7 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgctggc aggaggtggg caaggccatg tacgcccccc ccatccgcgg ccagatccgc 1080 
tgcagcagca acatcaccgg cctgctgctg acccgcgacg gcggcaagga gatcagcaac 1140 
accaccgaga tcttccgccc cggcggcggc gacatgcgcg acaactggcg cagcgagctg 1200
tacaagtaca aggtggtgaa gatcgagccc ctgggcgtgg cccccaccaa ggccaagcgc 1260 
cgcgtggtgc agcgcgagaa gcgcgccgtg accctgggcg ccatgttcct gggcttcctg 1320 
ggcgccgccg gcagcaccat gggcgcccgc agcctgaccc tgaccgtgca ggcccgccag 1380 
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcgt gctggccgtg 1500 
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 
tgcaccaccg ccgtgccctg gaacgccagc tggagcaaca agagcctgga ccagatctgg 1620
aacaacatga cctggatgga gtgggagcgc gagatcgaca actacaccaa cctgatctac 1680
accctgatcg aggagagcca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740 
gacaagtggg ccagcctgtg gaactggttc gacatcagca agtggctgtg gtacatcaag 1800 
atcttcatca tgatcgtggg cggcctggtg ggcctgcgca tcgtgttcac cgtgctgagc 1860 
atcgtgaacc gcgtgcgcca gggctacagc cccctgagct tccagacccg cttccccgcc 1920 
ccccgcggcc ccgaccgccc cgagggcatc gaggaggagg gcggcgagcg cgaccgcgac 1980 
cgcagcagcc ccctggtgca cggcctgctg gccctgatct gggacgacct gcgcagcctg 2040 
tgcctgttca gctaccaccg cctgcgcgac ctgatcctga tcgccgcccg catcgtggag 2100 
ctgctgggcc gccgcggctg ggaggccctg aagtactggg gcaacctgct gcagtactgg 2160 
atccaggagc tgaagaacag cgccgtgagc ctgttcgacg ccatcgccat cgccgtggcc 2220
gagggcaccg accgcatcat cgaggtggcc cagcgcatcg gccgcgcctt cctgcacatc 2280
ccccgccgca tccgccaggg cttcgagcgc gccctgctgt aactcgagcg tgct 2334 

<210> 8 
<211> 2316 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Thr202
<400> 8 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
gccacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020 
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080 
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140 
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200 
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260 
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620 
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680 
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 
atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040 
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220 
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 
cgccagggct tcgagcgcgc cctgctgtaa ctcgag 2316 

<210> 9 

<211> 2541 

<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Trp427-Gly431
<400> 9 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480

agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 

atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 

gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 

agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840

atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 

cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 

atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 

ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 

aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 

ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctgggg cggcaaggcc 1260 

atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 

ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380

ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440

cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 

gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 

cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 

aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 

atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740

ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 

agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 

cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920

caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980

ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040

gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 

agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 

atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220

ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280

gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340

ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400

agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 

gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520

cgcgccctgc tgtaactcga g 2541 

<210> 10 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Arg426-Gly431 
<400> 10 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcgg cggcaaggcc 1260 
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440 
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040 
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220 
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400 
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 
cgcgccctgc tgtaactcga g 2541 



<210> 11 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Arg426-Gly431B
<400> 11 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg cacccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcag cggcaaggcc 1260 
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380 
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440 
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740 
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040 
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220 
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400 
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 
cgcgccctgc tgtaactcga g 2541 

<210> 12 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Arg426-Lys432
<400> 12 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcgg caacaaggcc 1260 
atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 
ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380 
ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440 
cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 
gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 
cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 
agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 
ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040
gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 
agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220 
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 
gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 
ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400 
agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 
gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 
cgcgccctgc tgtaactcga g 2541 

<210> 13 
<211> 2535 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Asn425-Lys432 
<400> 13 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca acgcccccaa ggccatgtac 1260 
gcccccccca tccgcggcca gatccgctgc agcagcaaca tcaccggcct gctgctgacc 1320

cgcgacggcg gcaaggagat cagcaacacc accgagatct tccgccccgg cggcggcgac 1380

atgcgcgaca actggcgcag cgagctgtac aagtacaagg tggtgaagat cgagcccctg 1440 

ggcgtggccc ccaccaaggc caagcgccgc gtggtgcagc gcgagaagcg cgccgtgacc 1500 

ctgggcgcca tgttcctggg cttcctgggc gccgccggca gcaccatggg cgcccgcagc 1560 

ctgaccctga ccgtgcaggc ccgccagctg ctgagcggca tcgtgcagca gcagaacaac 1620

ctgctgcgcg ccatcgaggc ccagcagcac ctgctgcagc tgaccgtgtg gggcatcaag 1680 

cagctgcagg cccgcgtgct ggccgtggag cgctacctga aggaccagca gctgctgggc 1740 

atctggggct gcagcggcaa gctgatctgc accaccgccg tgccctggaa cgccagctgg 1800 

agcaacaaga gcctggacca gatctggaac aacatgacct ggatggagtg ggagcgcgag 1860 

atcgacaact acaccaacct gatctacacc ctgatcgagg agagccagaa ccagcaggag 1920

aagaacgagc aggagctgct ggagctggac aagtgggcca gcctgtggaa ctggttcgac 1980 

atcagcaagt ggctgtggta catcaagatc ttcatcatga tcgtgggcgg cctggtgggc 2040 

ctgcgcatcg tgttcaccgt gctgagcatc gtgaaccgcg tgcgccaggg ctacagcccc 2100 

ctgagcttcc agacccgctt ccccgccccc cgcggccccg accgccccga gggcatcgag 2160 

gaggagggcg gcgagcgcga ccgcgaccgc agcagccccc tggtgcacgg cctgctggcc 2220 

ctgatctggg acgacctgcg cagcctgtgc ctgttcagct accaccgcct gcgcgacctg 2280 

atcctgatcg ccgcccgcat cgtggagctg ctgggccgcc gcggctggga ggccctgaag 2340

tactggggca acctgctgca gtactggatc caggagctga agaacagcgc cgtgagcctg 2400

ttcgacgcca tcgccatcgc cgtggccgag ggcaccgacc gcatcatcga ggtggcccag 2460 

cgcatcggcc gcgccttcct gcacatcccc cgccgcatcc gccagggctt cgagcgcgcc 2520 

ctgctgtaac tcgag 2535

<210> 14 
<211> 2529 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Ile424-Ala433
<400> 14 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480

agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540

atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 

gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 

agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 

atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 

cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 

atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 

ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 

aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 

ggcaccatca ccctgccctg ccgcatcaag cagatcatcg gcggcgccat gtacgccccc 1260 

cccatccgcg gccagatccg ctgcagcagc aacatcaccg gcctgctgct gacccgcgac 1320 

ggcggcaagg agatcagcaa caccaccgag atcttccgcc ccggcggcgg cgacatgcgc 1380

gacaactggc gcagcgagct gtacaagtac aaggtggtga agatcgagcc cctgggcgtg 1440 

gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gaccctgggc 1500 

gccatgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgcccg cagcctgacc 1560 

ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagaa caacctgctg 1620 
cgcgccatcg aggcccagca gcacctgctg cagctgaccg tgtggggcat caagcagctg 1680

caggcccgcg tgctggccgt ggagcgctac ctgaaggacc agcagctgct gggcatctgg 1740 

ggctgcagcg gcaagctgat ctgcaccacc gccgtgccct ggaacgccag ctggagcaac 1800 

aagagcctgg accagatctg gaacaacatg acctggatgg agtgggagcg cgagatcgac 1860 

aactacacca acctgatcta caccctgatc gaggagagcc agaaccagca ggagaagaac 1920 

gagcaggagc tgctggagct ggacaagtgg gccagcctgt ggaactggtt cgacatcagc 1980 

aagtggctgt ggtacatcaa gatcttcatc atgatcgtgg gcggcctggt gggcctgcgc 2040 

atcgtgttca ccgtgctgag catcgtgaac cgcgtgcgcc agggctacag ccccctgagc 2100 

ttccagaccc gcttccccgc cccccgcggc cccgaccgcc ccgagggcat cgaggaggag 2160 

ggcggcgagc gcgaccgcga ccgcagcagc cccctggtgc acggcctgct ggccctgatc 2220 

tgggacgacc tgcgcagcct gtgcctgttc agctaccacc gcctgcgcga cctgatcctg 2280 

atcgccgccc gcatcgtgga gctgctgggc cgccgcggct gggaggccct gaagtactgg 2340 

ggcaacctgc tgcagtactg gatccaggag ctgaagaaca gcgccgtgag cctgttcgac 2400 

gccatcgcca tcgccgtggc cgagggcacc gaccgcatca tcgaggtggc ccagcgcatc 2460 

ggccgcgcct tcctgcacat cccccgccgc atccgccagg gcttcgagcg cgccctgctg 2520 

taactcgag 2529 

<210> 15 
<211> 2523 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Ile423-Met434
<400> 15 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcggcg gcatgtacgc cccccccatc 1260 
cgcggccaga tccgctgcag cagcaacatc accggcctgc tgctgacccg cgacggcggc 1320 
aaggagatca gcaacaccac cgagatcttc cgccccggcg gcggcgacat gcgcgacaac 1380 
tggcgcagcg agctgtacaa gtacaaggtg gtgaagatcg agcccctggg cgtggccccc 1440
accaaggcca agcgccgcgt ggtgcagcgc gagaagcgcg ccgtgaccct gggcgccatg 1500 
ttcctgggct tcctgggcgc cgccggcagc accatgggcg cccgcagcct gaccctgacc 1560 
gtgcaggccc gccagctgct gagcggcatc gtgcagcagc agaacaacct gctgcgcgcc 1620 
atcgaggccc agcagcacct gctgcagctg accgtgtggg gcatcaagca gctgcaggcc 1680 
cgcgtgctgg ccgtggagcg ctacctgaag gaccagcagc tgctgggcat ctggggctgc 1740
agcggcaagc tgatctgcac caccgccgtg ccctggaacg ccagctggag caacaagagc 1800 
ctggaccaga tctggaacaa catgacctgg atggagtggg agcgcgagat cgacaactac 1860 
accaacctga tctacaccct gatcgaggag agccagaacc agcaggagaa gaacgagcag 1920 
gagctgctgg agctggacaa gtgggccagc ctgtggaact ggttcgacat cagcaagtgg 1980 
ctgtggtaca tcaagatctt catcatgatc gtgggcggcc tggtgggcct gcgcatcgtg 2040
ttcaccgtgc tgagcatcgt gaaccgcgtg cgccagggct acagccccct gagcttccag 2100 

acccgcttcc ccgccccccg cggccccgac cgccccgagg gcatcgagga ggagggcggc 2160 

gagcgcgacc gcgaccgcag cagccccctg gtgcacggcc tgctggccct gatctgggac 2220 

gacctgcgca gcctgtgcct gttcagctac caccgcctgc gcgacctgat cctgatcgcc 2280 

gcccgcatcg tggagctgct gggccgccgc ggctgggagg ccctgaagta ctggggcaac 2340

ctgctgcagt actggatcca ggagctgaag aacagcgccg tgagcctgtt cgacgccatc 2400 

gccatcgccg tggccgaggg caccgaccgc atcatcgagg tggcccagcg catcggccgc 2460

gccttcctgc acatcccccg ccgcatccgc cagggcttcg agcgcgccct gctgtaactc 2520 

gag 2523 

<210> 16 
<211> 2517 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Gln422-Tyr435
<400> 16 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 

agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540

atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 

gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 

agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840

atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 

cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020

atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 

ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140

aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200

ggcaccatca ccctgccctg ccgcatcaag cagggcggct acgccccccc catccgcggc 1260 

cagatccgct gcagcagcaa catcaccggc ctgctgctga cccgcgacgg cggcaaggag 1320

atcagcaaca ccaccgagat cttccgcccc ggcggcggcg acatgcgcga caactggcgc 1380

agcgagctgt acaagtacaa ggtggtgaag atcgagcccc tgggcgtggc ccccaccaag 1440

gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtga ccctgggcgc catgttcctg 1500 

ggcttcctgg gcgccgccgg cagcaccatg ggcgcccgca gcctgaccct gaccgtgcag 1560 

gcccgccagc tgctgagcgg catcgtgcag cagcagaaca acctgctgcg cgccatcgag 1620

gcccagcagc acctgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 1680

ctggccgtgg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 1740

aagctgatct gcaccaccgc cgtgccctgg aacgccagct ggagcaacaa gagcctggac 1800 

cagatctgga acaacatgac ctggatggag tgggagcgcg agatcgacaa ctacaccaac 1860 

ctgatctaca ccctgatcga ggagagccag aaccagcagg agaagaacga gcaggagctg 1920

ctggagctgg acaagtgggc cagcctgtgg aactggttcg acatcagcaa gtggctgtgg 1980

tacatcaaga tcttcatcat gatcgtgggc ggcctggtgg gcctgcgcat cgtgttcacc 2040

gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagacccgc 2100 

ttccccgccc cccgcggccc cgaccgcccc gagggcatcg aggaggaggg cggcgagcgc 2160 

gaccgcgacc gcagcagccc cctggtgcac ggcctgctgg ccctgatctg ggacgacctg 2220 

cgcagcctgt gcctgttcag ctaccaccgc ctgcgcgacc tgatcctgat cgccgcccgc 2280 

atcgtggagc tgctgggccg ccgcggctgg gaggccctga agtactgggg caacctgctg 2340 
cagtactgga tccaggagct gaagaacagc gccgtgagcc tgttcgacgc catcgccatc 2400
gccgtggccg agggcaccga ccgcatcatc gaggtggccc agcgcatcgg ccgcgccttc 2460 
ctgcacatcc cccgccgcat ccgccagggc ttcgagcgcg ccctgctgta actcgag 2517 

<210> 17 
<211> 2517 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Gln422-Tyr435B
<400> 17 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag caggccccct acgccccccc catccgcggc 1260 
cagatccgct gcagcagcaa catcaccggc ctgctgctga cccgcgacgg cggcaaggag 1320 
atcagcaaca ccaccgagat cttccgcccc ggcggcggcg acatgcgcga caactggcgc 1380 
agcgagctgt acaagtacaa ggtggtgaag atcgagcccc tgggcgtggc ccccaccaag 1440
gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtga ccctgggcgc catgttcctg 1500 
ggcttcctgg gcgccgccgg cagcaccatg ggcgcccgca gcctgaccct gaccgtgcag 1560 
gcccgccagc tgctgagcgg catcgtgcag cagcagaaca acctgctgcg cgccatcgag 1620 
gcccagcagc acctgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 1680 
ctggccgtgg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 1740
aagctgatct gcaccaccgc cgtgccctgg aacgccagct ggagcaacaa gagcctggac 1800 
cagatctgga acaacatgac ctggatggag tgggagcgcg agatcgacaa ctacaccaac 1860 
ctgatctaca ccctgatcga ggagagccag aaccagcagg agaagaacga gcaggagctg 1920 
ctggagctgg acaagtgggc cagcctgtgg aactggttcg acatcagcaa gtggctgtgg 1980 
tacatcaaga tcttcatcat gatcgtgggc ggcctggtgg gcctgcgcat cgtgttcacc 2040 
gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagacccgc 2100 
ttccccgccc cccgcggccc cgaccgcccc gagggcatcg aggaggaggg cggcgagcgc 2160 
gaccgcgacc gcagcagccc cctggtgcac ggcctgctgg ccctgatctg ggacgacctg 2220 
cgcagcctgt gcctgttcag ctaccaccgc ctgcgcgacc tgatcctgat cgccgcccgc 2280 
atcgtggagc tgctgggccg ccgcggctgg gaggccctga agtactgggg caacctgctg 2340 
cagtactgga tccaggagct gaagaacagc gccgtgagcc tgttcgacgc catcgccatc 2400 
gccgtggccg agggcaccga ccgcatcatc gaggtggccc agcgcatcgg ccgcgccttc 2460 
ctgcacatcc cccgccgcat ccgccagggc ttcgagcgcg ccctgctgta actcgag 2517 

<210> 18 
<211> 2322 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Arg426-Gly431

<400> 18 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720 
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840 
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgcggcg gcggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200 
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320 
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380 
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440 
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040 
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220 
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322 

<210> 19 
<211> 2322 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Arg426-Lys432

<400> 19 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 

tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 

ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 

acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 

ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 

acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720 

gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 

tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840 

atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 

ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 

atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 

aaccgcggcg gcaacaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080

agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 

gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200 

tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 

gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320 

gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380 

agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440

ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 

tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 

accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 

atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 

atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740 

tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 

atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 

aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 

ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980 

agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040 

ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 

ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 

gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220 

accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 

cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322 

<210> 20 
<211> 2322 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Trp427-Gly431

<400> 20 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 

tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 

ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 

acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 

ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840 
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgctggg gcggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320 
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440 
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980 
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040 
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220 
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322 

<210> 21 
<211> 2310 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Lys121-Val200;
Asn425-Lys432

<400> 21 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaaggcc 360 
cccgtgatca cccaggcctg ccccaaggtg agcttcgagc ccatccccat ccactactgc 420 
gcccccgccg gcttcgccat cctgaagtgc aacgacaaga agttcaacgg cagcggcccc 480
tgcaccaacg tgagcaccgt gcagtgcacc cacggcatcc gccccgtggt gagcacccag 540 
ctgctgctga acggcagcct ggccgaggag ggcgtggtga tccgcagcga gaacttcacc 600 
gacaacgcca agaccatcat cgtgcagctg aaggagagcg tggagatcaa ctgcacccgc 660 
cccaacaaca acacccgcaa gagcatcacc atcggccccg gccgcgcctt ctacgccacc 720 
ggcgacatca tcggcgacat ccgccaggcc cactgcaaca tcagcggcga gaagtggaac 780
aacaccctga agcagatcgt gaccaagctg caggcccagt tcggcaacaa gaccatcgtg 840 
ttcaagcaga gcagcggcgg cgaccccgag atcgtgatgc acagcttcaa ctgcggcggc 900 
gagttcttct actgcaacag cacccagctg ttcaacagca cctggaacaa caccatcggc 960 
cccaacaaca ccaacggcac catcaccctg ccctgccgca tcaagcagat catcaacgcc 1020
cccaaggcca tgtacgcccc ccccatccgc ggccagatcc gctgcagcag caacatcacc 1080 
ggcctgctgc tgacccgcga cggcggcaag gagatcagca acaccaccga gatcttccgc 1140
cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 1200 
aagatcgagc ccctgggcgt ggcccccacc aaggccaagc gccgcgtggt gcagcgcgag 1260
aagcgcgccg tgaccctggg cgccatgttc ctgggcttcc tgggcgccgc cggcagcacc 1320 
atgggcgccc gcagcctgac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 1380
cagcagcaga acaacctgct gcgcgccatc gaggcccagc agcacctgct gcagctgacc 1440
gtgtggggca tcaagcagct gcaggcccgc gtgctggccg tggagcgcta cctgaaggac 1500 
cagcagctgc tgggcatctg gggctgcagc ggcaagctga tctgcaccac cgccgtgccc 1560 
tggaacgcca gctggagcaa caagagcctg gaccagatct ggaacaacat gacctggatg 1620 
gagtgggagc gcgagatcga caactacacc aacctgatct acaccctgat cgaggagagc 1680 
cagaaccagc aggagaagaa cgagcaggag ctgctggagc tggacaagtg ggccagcctg 1740
tggaactggt tcgacatcag caagtggctg tggtacatca agatcttcat catgatcgtg 1800 
ggcggcctgg tgggcctgcg catcgtgttc accgtgctga gcatcgtgaa ccgcgtgcgc 1860 
cagggctaca gccccctgag cttccagacc cgcttccccg ccccccgcgg ccccgaccgc 1920 
cccgagggca tcgaggagga gggcggcgag cgcgaccgcg accgcagcag ccccctggtg 1980 
cacggcctgc tggccctgat ctgggacgac ctgcgcagcc tgtgcctgtt cagctaccac 2040 
cgcctgcgcg acctgatcct gatcgccgcc cgcatcgtgg agctgctggg ccgccgcggc 2100 
tgggaggccc tgaagtactg gggcaacctg ctgcagtact ggatccagga gctgaagaac 2160 
agcgccgtga gcctgttcga cgccatcgcc atcgccgtgg ccgagggcac cgaccgcatc 2220 
atcgaggtgg cccagcgcat cggccgcgcc ttcctgcaca tcccccgccg catccgccag 2280 
ggcttcgagc gcgccctgct gtaactcgag 2310 



<210> 22 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Val120-Ile201;
Ile424-Ala433



<400> 22 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260 
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380 
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560 
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740 
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860 
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980 
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160 
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220 
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280 
gccctgctgt aactcgag 2298 

<210> 23 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence:
Val120-Ile201B; Ile424-Ala433



<400> 23 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgcccggc 360
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280
gccctgctgt aactcgag 2298

<210> 24 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Thr202;
Ile424-Ala433

<400> 24 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
gccacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260 
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380 
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560 
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800 
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860 
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980 
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160 
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280 
gccctgctgt aactcgag 2298 

<210> 25 
<211> 2358 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val127-Asn195

<400> 25

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc aacaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840 
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1320
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380 
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620 
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800 
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1860 
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1920 
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1980 
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 2040 
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2100 
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2160 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2220 
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2280 
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2340 
gccctgctgt aactcgag 2358 

<210> 26 
<211> 2352 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val127-Asn195;
Arg426-Gly431

<400> 26 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc aacaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020
accctgccct gccgcatcaa gcagatcatc aaccgcggcg gcggcaaggc catgtacgcc 1080
ccccccatcc gcggccagat ccgctgcagc agcaacatca ccggcctgct gctgacccgc 1140
gacggcggca aggagatcag caacaccacc gagatcttcc gccccggggg cggcgacatg 1200
cgcgacaact ggcgcagcga gctgtacaag tacaaggtgg tgaagatcga gcccctgggc 1260
gtggccccca ccaaggccaa gcgccgcgtg gtgcagcgcg agaagcgcgc cgtgaccctg 1320
ggcgccatgt tcctgggctt cctgggcgcc gccggcagca ccatgggcgc ccgcagcctg 1380
accctgaccg tgcaggcccg ccagctgctg agcggcatcg tgcagcagca gaacaacctg 1440
ctgcgcgcca tcgaggccca gcagcacctg ctgcagctga ccgtgtgggg catcaagcag 1500
ctgcaggccc gcgtgctggc cgtggagcgc tacctgaagg accagcagct gctgggcatc 1560
tggggctgca gcggcaagct gatctgcacc accgccgtgc cctggaacgc cagctggagc 1620
aacaagagcc tggaccagat ctggaacaac atgacctgga tggagtggga gcgcgagatc 1680
gacaactaca ccaacctgat ctacaccctg atcgaggaga gccagaacca gcaggagaag 1740
aacgagcagg agctgctgga gctggacaag tgggccagcc tgtggaactg gttcgacatc 1800
agcaagtggc tgtggtacat caagatcttc atcatgatcg tgggcggcct ggtgggcctg 1860
cgcatcgtgt tcaccgtgct gagcatcgtg aaccgcgtgc gccagggcta cagccccctg 1920
agcttccaga cccgcttccc cgccccccgc ggccccgacc gccccgaggg catcgaggag 1980
gagggcggcg agcgcgaccg cgaccgcagc agccccctgg tgcacggcct gctggccctg 2040
atctgggacg acctgcgcag cctgtgcctg ttcagctacc accgcctgcg cgacctgatc 2100
ctgatcgccg cccgcatcgt ggagctgctg ggccgccgcg gctgggaggc cctgaagtac 2160
tggggcaacc tgctgcagta ctggatccag gagctgaaga acagcgccgt gagcctgttc 2220
gacgccatcg ccatcgccgt ggccgagggc accgaccgca tcatcgaggt ggcccagcgc 2280
atcggccgcg ccttcctgca catcccccgc cgcatccgcc agggcttcga gcgcgccctg 2340
ctgtaactcg ag 2352